ABSTRACT



This study used ninety-one subjects in an attempt to specify 
social and acoustic variables which function significantly in the racial 
identification and rating of Negro and white speakers by Negro and 
white listeners. Eighty-six subjects, forty-three white and forty-three 
Negro, provided the listener responses. Subjects were chosen to 
provide a sample approximately representative of the distribution of 
socioeconomic status scores in Southeastern United States. 

Listeners were asked to judge the race and overall speech 
proficiency of speakers from listening to a recorded reading passage. 
Comparative control was exercised over the quality ratings through use
of a semi-objective articulatory product score which provided an
independent index of speech proficiency. Additional independent
variables included the socioeconomic status score; sex; age; number of
articulation errors divided into substitutions, omissions and distortions;
number of misarticulated phonemes; and a self-rating of speech
proficiency. All speaker and listener data were gathered under
controlled laboratory conditions. Analysis was carried out through
analysis of variance and covariance using a multiple regression
technique to determine variables which might be significant in
predicting racial identity perception and quality rating of speakers.

A spectrographic analysis was carried out using a subsample
consisting of ten Negro male and ten white male subjects. All
speakers used in this analysis had been correctly identified by 
listeners as to race 95% of the time or better. 

The purpose of this phase of the study was to specify spectral 
data in the resonance characteristics of speakers as seen in two selected 
vowel sounds which might function significantly in listener perception 
of racial identity and the quality rating of speakers. An intergroup 
comparison was carried out on the acoustic variables of formant 
frequency and relative formant amplitude from spectrographic displays 
of the (i) and (u) vowels. 

The results can be summarized as follows:

1. Number of phonetic distortions is significant in predicting listener 
identification of the race of speakers from recorded speech samples. 

2. Socioeconomic status score and articulatory product score are 
significant factors in predicting speech quality ratings received by 
Negro and white speakers from Negro and white listeners.

3. No significant intergroup differences were found in the comparison 
carried out on acoustic variables from spectrographic displays. The 
Negro speakers were found, however, to have consistently lower relative 
formant frequencies than the white group. 
CHAPTER I



INTRODUCTION 

Investigations directed at specification of relationships 
between man and his environment have occupied a major part of 
the history of science. Although it has been said that the proper 
study of man should focus on man himself, it has been realized 
that the nature of the human organism and its behavior cannot 
be properly viewed as separate from environmental stimulation 
and interaction. This consideration has formed the basis of the 
historic nature-nurture controversy in the study of human development.
Studies of the ways in which man responds to and interacts
with environmental stimulation have made considerable contri- 
bution to human welfare and the accumulated information of the 
behavioral sciences. 

There has been particular interest among scientists
concerned with human behavior in studies of the sensory-perceptual
responses of the human organism to physical stimuli. Such interests
in the nature of sensory communication have, according to S. S. Stevens
(46), created the hundred-year-old discipline of psychophysics in
which the principal concern is with the responses made by
organisms to the stimuli of the environment and the speci- 
fication of these consequences. 

The consequences of acoustic stimuli are of particular
interest since they constitute a significant portion of the
general interactional pattern through which man communi- 
cates with his environment. Psychophysics has routinely 
investigated loudness, pitch, and perceived duration as 
consequences of relatively simple acoustic stimuli. These
investigations have provided the basic information available 
today on the differential response characteristics of the 
human organism to sound stimuli. In general, however, 
according to Voiers (52), the level of information regarding
responses of the human auditory system to complex acoustic
stimuli such as speech is less precise and cannot be fully
specified in terms of elementary attributes of auditory sen- 
sation. 

The sensory-perceptual consequences of speech signals 
in man can be viewed and studied in terms of the informational 
content of such signals. Following the design of classical 
psychophysical studies, it is possible for investigators in
linguistics, psychology, acoustic phonetics and related 
disciplines to present speech signals in which careful 
control is maintained over signal characteristics and 
signal source variables and infer features of the infor- 
mational content of such signals from observation of subject 
responses. The literature of these disciplines offers numerous 
examples demonstrating the broad interest of investigators in 
the diverse information carried by speech signals in human 
communication. 

Ladefoged (29) has offered a three-part classifi- 
cation of the kinds of information conveyed in speech signals. 
It is believed that such classification is an important part of 
the general research effort directed toward the specification 
of what Peterson (37) has called the "information-bearing 
elements of speech." Ladefoged says that when we hear a
person talking we perceive the linguistic content of his 
message. This constitutes the class of linguistic features 
in speech signals which provides the essential linguistic 
information enabling the listener to know what the speaker 
has said. In addition to this class, the speech signal may
also tell the listener something about the general background
of the speaker or provide specific information about the place 
of origin of the speaker, his group membership and his status 
within the group. This constitutes the class of group features 
in speech signals which provides what Ladefoged has desig- 
nated socio-linguistic information. A third class, idiosyn- 
cratic features, provides personal information about the 
speaker. Such features, according to Ladefoged, may be 
attributed to anatomical and physical characteristics of the 
individual speaker such as the shape, size and coupling of 
resonance cavities of the vocal tract. On the other hand, 
socio-linguistic information is conveyed by those features 
of a person's speech which have been acquired through the 
influence of particular groups in which the speaker is or 
has been a member. It is possible, Ladefoged concludes, to 
structure studies designed to specify these classes of in- 
formational content either singly or in combination. 

The study reported here was designed to contribute 
psychoacoustic information in the area of sensory-perceptual 
consequences of speech signals. The personal and socio-linguistic
information of such signals was investigated experimentally
through the study of variables believed to interact
significantly in the identification by listeners of the race of 
speakers. An attempt was also made to specify acoustic, 
social, and personal variables which may constitute a basis for 
the quality judgement of speakers by listeners. 
CHAPTER II



STATEMENT OF THE PROBLEM AND BACKGROUND STUDIES 

Although a number of studies have been conducted to
specify the informational content of speech, the majority of 
these have focused on those features believed to provide the 
essential linguistic information in speech communication. 

Few experimental studies have been found to focus on idio- 
syncratic features providing personal information and group 
features providing socio-linguistic information. Generally, 
such studies have not included, as part of the experimental
design, spectrographic analysis intended to indicate possible 
acoustic perceptual variables in the information processing 
carried out by listeners. It was believed desirable that such
comprehensive studies be conducted within the disciplinary 
framework of the speech science laboratory and that they 
reflect interest in both basic science and the important role 
of human communication in problems of society. 
I. THE PROBLEM

Information content analysis. The problem in infor- 
mational content analysis formulated for this study was in 
two areas: (1) specification of those idiosyncratic and group
features which provide personal and socio-linguistic infor- 
mation to listeners regarding the racial identity (Negro or 
Caucasian) of the speaker; and (2) specification of those 
idiosyncratic and group features which provide personal and 
socio-linguistic information used by listeners in making 
overall speaker quality judgements. 

Social dialect analysis. Many studies in the area 
of dialectology have been carried out from the "encoder"
point of view. The efforts of Kurath (27) and others in 
describing the space of dialect geography in the United States 
and correlating varying speaker pronunciations with particular
locations provide a classic example of focus on the
encoder in dialect study. In a recent compilation Hymes (21) 
presents a number of classic papers in this area of research. 
Other investigators such as Harms (16) and Voiers (52) have 
stressed the need for such studies to focus upon the differential 
listener ("decoder") consequences related to the phonological
patterns of speech. According to Anisfeld, Bogo, and Lambert (2), 
listener reaction consequences include such features as differen- 
tial personality evaluation of the speaker. In their study, listener 
judgement as to personality characteristics of speakers was found 
to be significantly dependent upon whether or not the speaker 
spoke with "pure" or "accented" English. The conclusion was that 
accented English "... aroused certain perceptual hypotheses 
which had been acquired through previous experience with people 
who speak English with an accent." (p. 228) Apparently speech 
differences have been found to arouse certain stereotype reactions 
in listeners.

These findings were extended in a recent study reported 
by Markel, Eisler and Reese (32) which was designed to determine 
whether stereotypic personality judgement reactions take place 
in oral communication between native speakers of somewhat 
differing dialects. The results were the same as those obtained 
in the previous study of reactions between native and non-native 
speakers. Perceived dialect variation was sufficient to stimulate 
a stereotypic judgement concerning the personality characteristics 
of the speaker. These studies have not, however, provided
analysis of acoustic, social or personal variables which may 
function in the stimulation of such judgements. 

It has been recognized by Golden (12), and others
concerned with social dialect research as it relates to the 
successful enculturation of minority groups, that stereotypic 
judgements stimulated by speech differences play an important 
role in the nature of prejudice and negative reactions in inter- 
personal relationships. The problem in social dialect analysis 
formulated for this study concerned an attempt to specify signifi- 
cant factors which may function in inter-racial identification 
and rating of speakers by listeners. It was believed that such 
specification would contribute to efforts to provide equal edu- 
cational, social, and vocational opportunity and general upward 
mobility to sub-culture groups. In commenting on ethnolinguistic 
studies in this area, McDavid (34, p. 247) has said:

In making such investigations, the linguist 
does not assume that the mere recording of the 
fact will by itself resolve the tensions; he 
insists, however, that a framework of fact will 
be useful to those who seek objective dis- 
cussion of the problem at issue. 
Spectrographic analysis. The acoustic study of speech
signals through spectrographic analysis has identified parameters 
which are believed to be important variables in information con- 
veyed in human oral communication. The typical speech sound 
spectrogram, as developed in the work of Potter, Kopp and Green 
(40), provides an instrumentally generated display of the dis- 
tribution of sound energy in the frequency spectrum across time. 
An additional representation is possible which displays the 
relative amplitudes within the spectral energy envelope at any 
chosen point in time. Important acoustic parameters identified 
through speech sound spectrography have included resonance 
regions which are seen as components in the distribution of 
sound energy in vowel sounds. These resonances are called 
formants and are thought to represent normal modes of vibration 
of the cavities of the vocal tract. 

Peterson (38, p. 182) has said that, "If the vocal 
mechanism is considered to be the fundamental information source 
in speech, then measurements of the acoustical signal which 
most directly reflect its properties are of primary significance." 
The spectrographic analysis problem formulated for this study 
concerned an attempt to specify any significant intergroup
differences existing in the spectral characteristics of vowel 
sounds which may be related to the racial identification and 
quality judgement of speakers. 

II. BACKGROUND STUDIES 

Informational content analysis. Investigators in 
psychology, linguistics, education, and other fields have 
long believed that speech signals provide basic personal and 
social status information concerning speakers. Gray and Wise 
(13, p. 11) have said, for instance, that "... much of what 
we have called personality is found, when it is carefully 
analyzed, to be resident in the voice." 

Many studies have demonstrated the extent to which 
listeners infer information in speech from the idiosyncratic 
and personal features described by Ladefoged (29). Although
the results of some of these studies are not directly pertinent 
to the research reported here, they do indicate the range of 
interest in informational content analysis. H. C. Taylor (48), 
for instance, reported a study on the extent to which listeners 
agree on the personality traits of speakers from listening to
their voices. Stagner (45) studied the relationship between 
judgements of voice and personality. Eisenberg and Zalowitz (10) 
have reported on the extent to which listeners are able to judge 
dominance-feeling of speakers from phonograph recordings of 
voices. A recent study by Ptacek and Sander (41) demonstrated 
that listeners are able to make accurate gross identification of 
the age of speakers under a variety of listening conditions. 

A series of studies carried out by Harms attempted to 
investigate the information carried in speech signals relative 
to the social status of speakers. The most recent (18) report 
concluded that the signal apparently carries valid informational 
content on the social status of the speaker. A high correlation 
was found between listener judgements of socioeconomic status 
and objectively obtained status scores. In an earlier study (17), 
listeners from different social status groups heard short recorded 
messages from speakers of different social statuses. After 
listening, the respondent attempted to replace words which had 
been systematically deleted from a written version of the messages. 
Based upon degree to which listeners were able to replace the 
deleted words, speakers of high status were found to be the
most comprehensible. The same criteria also indicated that
listeners were most successful in comprehension when re- 
sponding to speakers of their own status rating. In an 
additional study more directly related to the problems posed
in this research, Harms (16) found that listeners from different 
social strata were capable of rating the social status of an 
individual after hearing ten to fifteen seconds of his tape 
recorded speech. Listeners also rated the high-status speakers 
as more credible than low-status speakers. While these studies
make an excellent contribution, it can be noted that apparently 
speakers of differing status groups were treated categorically 
on the matter of speech proficiency. There was also no re- 
ported attempt to differentiate the function of acoustic variables 
in the speech signals. 

Several recent studies of the informational content of 
speech signals have focused on the basis of speaker recog- 
nition and identification by listeners. Holmgren (19) attempted 
to specify some of the physical and psychological correlates 
of speaker recognition and concluded that listeners can 
differentiate reliably among speakers on the basis of judged
voice characteristics. 

The aim of the previously cited study by Voiers (52) 
was to identify the information bearing elements in complex 
speech signals which are available and used by listeners in 
making identification judgements of speakers. A secondary 
aim was to specify what the author calls "extrastimulus
factors" operative in the perceptual responses of listeners
to voice stimuli. Such factors include listener biases and 
idiosyncratic listener errors based upon particular kinds of 
listener-speaker interaction which cannot be solely speci- 
fied in terms of acoustic constituents of the input signal. 
Presumably, this would include listener preparatory perceptual 
sets based upon personal feelings and past history. 

A study reported by Dickens and Sawyer (7) in South- 
eastern United States in 1952 was directed at problems simi- 
lar to those posed in this study. The results have implica- 
tions both in the area of informational content and social 
dialect analysis. The authors were interested in investigating 
perceived differences in vocal quality using Negro and white 
speakers and listeners. In contrast with the research reported
here, however, the Dickens and Sawyer study did not consider 
social or educational variables or attempt to compare perceived 
speaker quality with speaker quality determinations obtained 
through more objective means. Since vocal quality was con- 
sidered to be solely a quality of listener perception rather than 
a physical property of sound, no attempt was made to correlate 
listener judgements with acoustic features of the speech signal. 
Twenty college students served as speaker subjects and members 
of college public speaking classes served as judges. 

Although the research population and number of variables
were quite distinct from those of the present study, some of the
results may provide useful comparison. The authors found, for 
example, that there was approximately seventy per cent correct 
identification of the race of the speakers; that the white observers 
were more accurate in racial identification than the Negro observers; 
and that there was significantly greater accuracy shown by observers 
in identifying speakers of their own race. Additional findings indicated
greater accuracy in identifying the race of male speakers.

The combined judgement of all listeners rated Negro females and 
white males as highest in vocal quality. Of particular interest 
was the finding that racial bias in quality rating was of low
statistical significance and that the amount of bias that was 
present favored voices of the other race. No explanation was 
offered for this latter result. 

It is apparent that considerable research effort has been 
invested in systematic study of idiosyncratic and group features 
which convey personal and sociolinguistic information in speech 
signals. None of the investigations reviewed, however, carried 
out an analysis of a broad range of social, educational, economic, 
acoustic and vocal quality variables constituting specific infor- 
mational content of speech signals used by listeners in per- 
ception of speaker racial identity and vocal quality. 

Social dialect analysis. There is considerable current
interest on the part of educators, social psychologists, sociolo- 
gists, linguists, and others in the effect of speech and language 
differences on the educational, vocational and psychological 
welfare of children and adults of ethnic minorities. This interest
has grown as society has experienced problems associated with 
increased social, educational, and economic desegregation. 
Research studies and educational programs related to these
socio-cultural changes have focused particular attention on
the language and phonological factors of human communication. 

It is generally believed that these are important factors to be 
considered in efforts to bring about full participation by members 
of isolated minority groups in the educational and vocational 
opportunities of society. 

This importance has been further emphasized in recent 
work of this investigator and the Department of Speech Pathology 
and Audiology, University of Virginia in programs supported by 
Title IV of the Civil Rights Act of 1964. In 1965-1966 the 
Charlottesville, Virginia City Schools received Grant Number 
OE-6-36-5 6-008 under this Act to develop programs to counter- 
act certain problems which were believed to be associated with 
continued racial desegregation. One of these problems involved 
complaints about the speech patterns of Negro teachers in 
newly integrated faculties and schools. At the request of the 
Superintendent of Schools, a special non-credit academic program
was designed by this investigator within the Department of Speech 
Pathology and Audiology, University of Virginia to answer the needs 
of individual teachers and the School Division. It is hoped that 
the findings reported here will provide information basic to the
solution of such problems and to the continued function of 
educators and speech scientists in facilitating socio-cultural 
changes in society. 

Interest in the effect of speech differences on opportu- 
nities available to minority group members has been expressed 
by many and this interest has stimulated increased research in 
the area of socio-linguistics. According to Green (14), the 
non-standard speech of the majority of American Negroes can 
be seen as the major obstacle to successful entrance into a 
predominantly white world. Francis (11) has noted, in work 
being conducted by Northern universities in traditionally 
Negro colleges, that Negro students speak a dialect considered 
by some to be socially inferior. The application by linguists 
and speech scientists of various methods of social dialect 
analysis to this problem has resulted in much recent and con- 
tinuing research. 

An extensive analysis of dialect related barriers to 
communication was recently reported by McDavid, et al. (33)
from the Chicago, Illinois area. In the section of the
study conducted by Larson and Larson (31) on listener reactions
to pronunciation it was found that the pronunciation
patterns of Negroes were generally rated as more unpleasant, 
less educated, and less urban than white pronunciations.
Listeners tended to favor white pronunciations and were able 
to distinguish between white and Negro speakers even when 
the pronunciations were very similar. A finding of particular 
interest was that Negro judges tended to agree with white 
judges. The authors interpret this to mean that many Negroes 
may implicitly accept the white standard of pronunciation as 
more valuable. This finding is in contrast to that reported 
by Dickens and Sawyer (7) in the study cited previously. 

The McDavid, et al. (33) study may be partially
supported by social dialect studies conducted by Labov, 
Cohen and Robins (28) in the New York City area. These 
investigators reported that there may be an unconscious conflict
of values in the speaker of a non-standard dialect. It
is stated (p. 23) that:

... it is possible for a lower-class 
speaker to participate in the full socio- 
linguistic structure of a speech communi- 
ty, and possess a good knowledge of the norms 
of careful speech, yet be unable or unwilling 
to use these forms in speech or writing. 
Linguists conducting social dialect studies appear to be
in increasing agreement on the existence of a Negro dialect as
a legitimate dialect of English. Some believe that certain features
of the pattern may transcend traditional dialect geography boundaries.
In the Labov, et al. (28) study, for instance, it is stated
(p. 23) that:

The grammatical patterns underlying deviations 
from standard English for Negro subjects of the 
Lower East Side are not characteristic of a particu- 
lar local or regional dialect, but have been found 
in Harlem, Chicago, Cleveland, Philadelphia,
Boston and South Carolina, also.

Studies conducted by Harlan Lane and associates (30) 
of the Center for Research in Language and Language Behavior, 
University of Michigan have concluded (p. 20) that: 

Recent linguistic research has shown that the speech 
patterns of southern Negroes constitute a legitimate 
dialect of English with grammatical (including phonological)
rules somewhat different from General American
English (GAE).

Some of their reported research has been directed at determining 
whether the distinguishing characteristics of Negro dialect lead 
to differences in the ways in which Negroes perceive speech. 

It was found that speakers of Southern Negro dialect are not as 
accurate as Caucasians of the same geographic and socioeconomic
background in correctly perceiving General American English. This
finding was supported in an earlier study by Caroline (5) in which it
was found that there are statistically significant differences in in- 
telligibility scores between the white and Negro students. 

The effect of these dialect related speech and language 
differences on communication and learning ability in educational 
settings is being considered by Weener (54). Some educators
have been concerned about the possible problem of communication 
in classrooms in which there is dialect disparity between the teacher 
and the pupil. Weener is currently engaged in research directed at 
specification of the effects of dialect differences between teachers 
and pupils on the immediate recall of verbal messages. 

Hurst (20) has investigated the psychological and socio- 
logical correlates of dialect difference and has used the term 
"dialectolalia" to refer to dialect related speech differences which
are so non-standard that they have a potentially negative effect on 
the psychological, educational, and vocational welfare of the indi- 
vidual. He has referred to the work of Anisfeld (1) who found that 
listeners tend to be influenced negatively by stereotypes which are
reinforced by the speech patterns of speakers. In an interesting 
study conducted by Stroud (47) and cited by Hurst it was found that 
judges were able to discriminate between recorded voices of Negro 
and white students in ninety-three per cent of the cases. It was 
also found that, as socioeconomic status went up, there was a
reduction in identification errors. 

A study conducted by Edmonds (9) has provided some 
information on the function of socioeconomic and sex differences 
in verbal ability among Southern Negro high school students. 
Socioeconomic status was found to have a greater relationship to 
verbal ability than the sex of the speaker. This study showed 
no significant differences between the verbal abilities of males 
and females within the deprived group. 

The only study found to focus on phonological analysis 
was that reported to be currently in progress by Shuy (44). An
interim report has stated that (p. 73) the research is attempting to 
carry out a "contextual phonological analysis" to investigate 
"hypotheses concerning phonological correlates of stratification . . ."
The investigator is particularly interested in the presence, absence 
and substitution of nasal components and the ways in which these 
factors may relate to the social level of the speaker.

In 1952 Currie (6) made a strong plea for consciously directed 
research in the area designated socio-linguistics. Although much has 
been done and impetus provided by changing social patterns and emerg- 
ing social problems has greatly stimulated the activities of linguists 
and speech scientists, there are a number of unresolved questions regarding
the interrelationship of significant speaker and listener variables
and specific analysis of intergroup and intragroup acoustic differences. 

None of the studies reviewed have reported a systematic 
specification of the educational, vocational, socioeconomic, and 
acoustic variables which function in the listener identification of the 
Negro speaker as a Negro speaker. There has been a particular lack 
of attempt to measure and differentiate the function of levels of ob- 
jectively determined speech proficiency among Negro speakers which 
relate to identification and rating by listeners. Is there a point, for 
instance, on a scale of speaker proficiency at which stereotypic, 
negative reactions of listeners are seen to change? None of the 
studies reviewed have considered a broad range of listener as well 
as speaker variables which may be significant in the perception of 
racial identity and quality judgement. No studies have been found 
which have extended acoustic analysis to a comparative spectrographic
investigation on speakers who have been consistently
identified as to race by both Negro and white listeners. 

Spectrographic analysis. A preliminary investigation conducted
in the Speech Science Laboratory of the University of
Virginia was pertinent to the currently reported research. The 
results of the preliminary study indicated the possible efficacy of 
research designed to specify variables in listener identification and 
rating of speakers which may be a function of the distribution of
acoustic energy within the speech spectrum.

The study, "The Effect of Signal Bandwidth Compression on
Listener Perception of Racial Identification," by Bryden (4), used
spectral filtering in which the speech signal was compressed to a 
500 Hz bandpass between 1250 Hz and 1750 Hz. Although frequency 
distortion procedures in which the bandwidth is compressed or interrupt- 
ed through electrical filtering are most often carried out for the purpose 
of determining the effect on signal intelligibility , this study employed 
controlled distortion technique to determine only the perceptual effect 
on whatever information bearing elements there may be in the speech 
signal which provide significant listener cues regarding the racial 

identity of the speaker. 
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A sample of twenty speakers , ten white and ten Negro , was 
recorded using a standard reading passage. T/vo experimental tape 
recordings were prepared with the same speakers appearing in 
different randomly determined order. One of the tapes was subjected 
to controlled signal bandwidth compression procedures. The null 
hypothesis stated for the study was that such procedures would 
not significantly effect listener ability to make correct identifications 
of the race of the speakers. It was speculated that it might be 
necessary to accept the null since it was generally believed that 
informational cues for racial identity would not be a function of 
spectral energy distribution but rather a matter of stress related to 
vocal effort. 

A group of twenty listeners, ten white and ten Negro, 
listened to the two tapes and attempted to record a correct identifi- 
cation of the race of each speaker. A forced choice condition was 
used. Responses were scored on the basis of percentage of correct 
judgement. The mean correct listener response on the unfiltered 
tape was seventy-four per cent. The listener response to the 
filtered tape produced lower scores with a mean difference of 6.25. 
This difference was significant at the .01 level of confidence,
indicating that the spectral filtering significantly disturbed the
listener's ability to make correct judgements of the race of the 
speakers. The null hypothesis was rejected. 
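
The scoring used in the preliminary study can be sketched as follows. This is a minimal illustration only: it assumes each listener's forced-choice judgements are stored as a list of labels, and the function names and the small data set are hypothetical rather than taken from the study.

```python
def percent_correct(judgements, actual):
    """Percentage of forced-choice judgements that match the actual race."""
    if len(judgements) != len(actual):
        raise ValueError("one judgement is required per speaker")
    hits = sum(1 for j, a in zip(judgements, actual) if j == a)
    return 100.0 * hits / len(actual)


def mean_difference(unfiltered_scores, filtered_scores):
    """Mean of the per-listener differences (unfiltered minus filtered)."""
    diffs = [u - f for u, f in zip(unfiltered_scores, filtered_scores)]
    return sum(diffs) / len(diffs)


# Hypothetical example: four speakers judged by one listener on each tape.
actual = ["N", "W", "N", "W"]
unfiltered = percent_correct(["N", "W", "N", "N"], actual)  # 3 of 4 correct
filtered = percent_correct(["N", "N", "W", "N"], actual)    # 1 of 4 correct
```

A significance test on the paired differences would follow from these per-listener scores; the source does not name the test used, so none is shown here.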

It was apparent from this preliminary study that there may 
be important informational cues for the racial identification of 
speakers by listeners which were removed through the application 
of controlled signal distortion technique. The creation of the 500 
Hz bandpass between 1250 Hz and 1750 Hz was sufficient to alter 
the perceptual ability of listeners even though a number of the 
listeners reported that they consciously focused their perceptual 
set on certain non-segmental elements such as vowel distortion
and prosody. 

The pilot study suggested that further research might be 
conducted to investigate the extent to which listeners are able to 
make auditory perceptual judgements about the racial identity of 
speakers and the significant listener and speaker variables asso- 
ciated with these judgements. It was believed that a comparative 
analysis might identify inter-speaker acoustic differences which 
function in such identification.

The question of whether there may be intrinsic physical
differences in the dimensions of the vocal tract which would be 
reflected in the arrangement of the harmonic components of the 
speech signal and which might account for what has come to be 
identified as Negro dialect, was raised by Claude M. Wise (55) 
in an article published in 1933. Wise said (p. 523) at that time: 

. . . the characteristic Negro vocal quality 
seems to result from a tongue position which 
may possibly be a heritage from the original 
African speech. 

Wise goes on to say (p. 523) that: 

This quality surely cannot result from any 
peculiar physical formation of Negro reso- 
nance cavities, for northern Negroes, reared 
among a majority of whites, have nothing of
this Negro voice quality. 

Apparently, it was believed that the "tongue position" was a learned
behavior, which was a part of the cultural-linguistic history of the 
American Negro rather than a reflection of intrinsic physical differences. 

The spectrographic analysis reported in this study compared 
the speech of a group of male speakers consistently identified by 
listeners (Negro and white) as being Negro speakers and the speech 
of a group of male speakers consistently identified by listeners
(Negro and white) as being white speakers. The speech samples
were analyzed by measurement of the variables of vowel formant 
frequencies and relative vowel formant amplitudes and inter- 
group comparison was conducted in an attempt to specify sig- 
nificant differences. 

According to a recent study reported by Dixon (8) , such 
measurements are usually made from amplitude sections of vowel 
sounds using the narrow band filter on a sound spectrograph to 
resolve the individual harmonics (24,25). From such displays 
it is possible to determine the location of formants on the frequency 
scale and make measurements of the relative amplitudes of the 
formant frequency peaks. Although there has been some contro- 
versy concerning the best method of determining formant frequency 
location, Peterson (39) has concluded that, in the light of present 
knowledge, it is reasonable to define the formant frequency as the 
frequency of the harmonic with the greatest amplitude peak. There 
is general agreement, according to Dixon (8), that the relative in-
tensity of formants is determined by measuring the number of deci-
bels from the peak of one formant to the peak of another.

Since the spectrographic analysis reported here involved
specification of mean differences in intergroup comparison, rather
than a descriptive specification, it is not considered necessary
to detail the many studies in acoustic phonetics which have been
concerned with attempted descriptions of invariance in speech
perception. A search of the literature failed to reveal any pre-
vious studies in which speech sound spectrography has been used
for intergroup comparison in social dialect analysis.
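
The measurement conventions described above (formant frequency taken as the harmonic with the greatest amplitude peak, and relative intensity taken as the decibel difference between two formant peaks) can be sketched as follows. This is only an illustration: the function names are hypothetical, and the harmonic values are invented, not measurements from this study.

```python
def formant_peak(harmonics):
    """Return the (frequency_hz, level_db) harmonic with the greatest level.

    `harmonics` lists the harmonics read from an amplitude section within
    one formant region, as (frequency in Hz, level in dB) pairs.
    """
    if not harmonics:
        raise ValueError("at least one harmonic is required")
    return max(harmonics, key=lambda h: h[1])


def relative_amplitude_db(peak_a, peak_b):
    """Decibel difference from the peak of one formant to the peak of another."""
    return peak_a[1] - peak_b[1]


# Hypothetical harmonics for the first and second formant regions of a vowel.
f1_region = [(240, 28.0), (360, 34.0), (480, 30.0)]
f2_region = [(2040, 18.0), (2160, 22.0), (2280, 19.0)]

f1 = formant_peak(f1_region)
f2 = formant_peak(f2_region)
difference = relative_amplitude_db(f1, f2)
```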

III. QUESTIONS 



Experimental treatment of the problems discussed previously 
in informational content analysis, social dialect analysis and 
spectrographic analysis was designed to answer the following 
questions: 

1. What is the function of the actual race of 
speakers and the actual race of listeners 
in the racial identity of speakers perceived 
by listeners on the basis of acoustic in- 
formation? 

2. What is the function of the sex of speakers 
and the sex of listeners in perception of
racial identity?

3. What is the function of the socioeconomic
status of speakers and of listeners in
perception of racial identity?

4. What is the function of such speaker and 
listener proficiency indicators as semi- 
objective articulatory product score, 
number of articulation errors and number of 
misarticulated phonemes in perception of 
racial identity? 

5. What is the function of speaker and 
listener self-rating of speech pro- 
ficiency in perception of racial identity? 

6. What is the function of the overall per- 
ceived race of speakers in quality 
rating of speakers by listeners? 

7. What is the function of the race of
listeners in overall quality rating 
of Negro and Caucasian speakers? 

8. What is the function of the sex of 
speakers and of listeners in overall 
quality rating of Negro and Cau- 
casian speakers? 

9. What is the function of the socio-
economic status of speakers and
of listeners in overall quality rating
of Negro and Caucasian speakers?

10. What is the function of such speaker 
and listener proficiency indicators 
as semi-objective articulatory product 
score, number of articulation errors 
and number of misarticulated phonemes 
in overall quality rating of Negro and 
Caucasian speakers?

11. What is the function of speaker and listener
self-rating of speech proficiency in overall 
quality rating of Negro and Caucasian 
speakers? 

12. Are there significant mean differences on 
selected acoustic variables between a 
group of speakers consistently judged 
by listeners to be Caucasian and a group
of speakers consistently judged by listeners
to be Negro?

The experimental design and procedures developed to pursue 
these questions are described in the next chapter. 



CHAPTER III 



PROCEDURES AND PERFORMANCE OF SUBJECTS 

The experimental design and procedures developed to 
answer the questions listed previously can be conveniently 
viewed in two sections: (1) an analysis of the function in
racial identification and rating of speakers by listeners of 
such listener and speaker variables as race, sex, socio- 
economic status, measured speech proficiency and self- 
rating of speech proficiency, and (2) a comparison of 
speakers perceived by listeners to be Negro with speakers 
perceived by listeners to be Caucasian on such spectro- 
graphically displayed acoustic variables as vowel formant 
frequencies and relative vowel formant amplitudes. This 
latter section was designed to use a sample of the original 
sample in an attempt to identify spectral variables which 
may function significantly in the racial identification of 
speakers by listeners. 

Although evidence presented by Labov (28) would 
place little restriction on the extent of generalization possible
from studies of Negro dialect, the population specifically
designated for this study was Area 15 of the Linguistic
Atlas of the Eastern United States (27). The sample was
drawn from the City of Charlottesville and surrounding
counties. This area is in the approximate geographic center
of Area 15, which includes most of central and eastern Vir-
ginia and large sections of Maryland and North Carolina.
According to reports published by the Bureau of Population
and Economic Research, University of Virginia (49, 50), the
Charlottesville-Albemarle area has a total population of
74,766 and a per capita income level of $2,368.00 per
year. The presence of the University of Virginia, widely
diversified light and heavy industry and the usual laboring 
occupational classes provide access to a wide distribution 
of socioeconomic and educational achievement levels in 
both Negro and Caucasian population groups. 



I. EXPERIMENTAL DESIGN 

A. ANALYSIS OF THE FUNCTION OF LISTENER AND SPEAKER
VARIABLES IN RACIAL IDENTIFICATION AND RATING OF
SPEAKERS

Listener Response Data

1. Listener identification of the race of the speaker
   (Caucasian or Negro)

2. Quality rating of speakers by listeners (good
   speaker - average speaker - poor speaker)

Data Collected on Subjects Used as Speakers and Listeners

1. Social and personal

   a) Race

   b) Sex

   c) Socioeconomic status score representing:

      (1) Occupation

      (2) Educational level

      (3) Family income

2. Non-spectral acoustic

   a) Articulatory product score representing:

      (1) Speaking time duration

      (2) Number of misarticulated words

   b) Total number of articulation errors representing:

      (1) Number of distortions

      (2) Number of omissions

      (3) Number of substitutions

   c) Total number of misarticulated phonemes

   d) Self-rating of speech proficiency (good speaker -
      average speaker - poor speaker)

B. INTERGROUP COMPARISON OF SPEAKERS 

A sample of the sample used for the previous analysis 
was drawn at random from a pool consisting of male speakers 
who had been consistently identified as being Negro speakers 
and male speakers who had been consistently identified as 
being Caucasian speakers. This procedure produced two groups 
of speakers: (1) ten male speakers who had been correctly identi-
fied by listeners 95% of the time or better as being Negro speakers and
(2) ten male speakers who had been correctly identified by
listeners 95% of the time or better as being white speakers. Intergroup
comparison of means was carried out on the following spectro-
graphic variables to identify significant differences. The rationale
for selection of the vowels to be studied is discussed in the
section on spectrographic procedures:

1. Formant frequencies on (u) and (i)

   a) Mean F1 frequency

   b) Mean F2 frequency

   c) Mean F3 frequency

2. Relative formant amplitudes on (u) and (i)
   (Reference point = peak)

   a) Mean F1 - F2 (Difference)

   b) Mean F2 - F3 (Difference)

   c) Mean F1 - F3 (Difference)

II. PROCEDURES 

Sampling procedures. Subjects were chosen at random
within socioeconomic status categories representing the popu-
lation of Southern United States (51). Population sources for
sampling included senior high school students, university
students and job applicants at the Virginia State Employment
Bureau. The following criteria were adopted in selection of
subjects for this study:

A. Age. Subjects were adults between the ages of 17 and 65.
The lack of significant variability in vocal pitch levels of
adults between these ages has been supported by research
reviewed and reported by McGlone and Hollien (35).

B. Sex. The sample contained both male and female subjects.

C. Race. The sample contained an approximately equal
number of Negro and Caucasian subjects.

D. Socioeconomic status. Subjects were selected to provide
both Negro and Caucasian representation from low,
middle and high status levels.

E. Hearing. Subjects had normal hearing in both ears as
determined by pure tone audiometric screening
using a frequency sweep at 25 dB (ISO). The rejection
criterion was failure to hear one or more frequencies in
either ear.

F. Speech. Subjects were free from gross organic pathologies
of speech such as cerebral palsy, cleft palate and cere-
brovascular trauma (determined by interview).

G. Reading ability. Subjects were literate as determined by
ability to read the test passage with no problem in word
recognition apparent after two practice trials (determined
by interview).

H. General health. Subjects were free from any extreme oral,
nasal or pharyngeal congestion (determined by interview).

I. Linguistic geographic background. Subjects were lifetime
residents of Area 15 as described above (determined by
interview).

J. Speech training. Subjects had no formal speech training
beyond a high school or college public speaking or voice
and diction course.

Sample description. Table 1 indicates that a total of 91
subjects participated in this study. With the exception of five
who did not return, all subjects served as both speakers and
listeners. The sample of listeners providing the response data
referred to in the experimental design contained 43 Negro and
43 white subjects.

TABLE 1

NUMBER OF NEGRO AND WHITE SUBJECTS USED AS
SPEAKERS AND LISTENERS

                            Race
Type of Subject      Negro      White      Total

Speaker                47         44         91
Listener               43         43         86

Table 2 describes the mean age and age range of the
subjects and Table 3 offers a description of the sex of the sub-
jects by race.

TABLE 2

MEAN AGE AND AGE RANGE OF SUBJECTS

Race of Subjects      Mean Age      Range

Negro                    30         17 - 59
White                    21         17 - 29
All Subjects             25         17 - 59

TABLE 3

SEX OF SUBJECTS

                             Sex
Race of Subjects      Male      Female      Total

Negro                  26         21          47
White                  31         13          44
All Subjects           57         34          91

Table 4 indicates the means and ranges of socioeconomic
status scores calculated for Negro and white subjects. Although
considerable disparity can be noted between the mean score for
Negro subjects and the mean score for white subjects, Table 5
illustrates the extent to which this difference reflects the per-
centage distribution of such scores in the population of Southern
United States. The greatest problem in sample stratification on
the socioeconomic status variable was matching the sample to
the population at the extreme ends of the scale. It will be noted
that, although reasonably close matching exists in the middle
categories, there is some disparity at the extremes of the scale.
Consideration was given to dropping a number of subjects from
the upper socioeconomic category of each subject group. This
was not done, however, as the upper groups were found to be homo-
geneous on the experimental variables. It was not believed,
therefore, that the numerical weighting of this group would sig-
nificantly affect the analysis. In addition, it was felt desirable
that sample size be kept as large as possible.

TABLE 4

SOCIOECONOMIC STATUS SCORES OF SUBJECTS

Race of Subjects      Mean SES Score (51)      Range

Negro                       43.34              9 - 98
White                       60.78             13 - 99
All Subjects                51.78              9 - 99

TABLE 5

PERCENTAGE DISTRIBUTION OF SOCIOECONOMIC STATUS SCORES IN
THE SAMPLE AND IN THE POPULATION OF SOUTHERN UNITED STATES

Score Categories      Percent in           Number in      Percent in
(Range = 0-100)       Population (51)      Sample         Sample

Distribution of Negroes

80 - 99                    1.60               10            21.27
50 - 79                    8.10                4             8.51
20 - 49                   41.80               23            48.93
 0 - 19                   48.50               10            21.27

Distribution of Whites

80 - 99                   12.10                9            20.45
50 - 79                   36.60               18            40.90
20 - 49                   37.00               15            34.09
 0 - 19                   14.20                2             4.54

Interviewing procedures. Subjects were interviewed by
the Investigator or the Research Assistant. Items in the criteria
listed previously were checked to determine eligibility and in-
formation for the Questionnaire Form was requested from each
subject (see Appendix A). The information item regarding "race"
did not appear on the Questionnaire Form but was coded by the
interviewer as part of the eligibility checklist which appeared in
a box in the upper right hand corner of the form. The form was
constructed to provide all information necessary for the personal
and social variables.

Subjects were told that we were collecting samples of
the voices of many speakers to check on the quality of our tape 
recording. Each subject was assigned a number taken at random 
from tables. The number appeared on the Questionnaire Form, 
the data form and as a spoken record on the subsequent tape 
recording. At the end of the interview subjects were given a 
hearing screening test and a copy of the test passage. They were 
asked to read the test passage over silently and ask about any 
unfamiliar words. They were then asked to read the passage out 
loud once for practice. 

The test passage. The test passage adopted for this 
study was the one offered by Guttman (15) to accompany the 
Articulatory Product formula. This formula will be discussed in 
detail in a later section dealing with the recording of data. The 
test passage is as follows (p. 325): 

Many people want to have relatively heavy 
breakfasts that include a rich sweet such as 
cake. Others purposely restrict themselves to 
a glass of orange juice. Some frequently go 
without a morning meal. Do those who eat lightly 
lunch early?

Guttman has stated (p. 325) the following criteria for the
development of the test passage and has pointed out the findings
of Morrison (36), Sherman and Morrison (43), and Sherman and
Cullinan (42) that specimens of speech at least as short as ten
seconds can be rated reliably:

(1) Words of uncertain syllabic number (e.g.,
"general") should be avoided; (2) words of un-
certain word count (e.g., "anyone") should be
avoided; (3) "and" and "the" should be avoided
(since they suffer severe reduction); (4) nearly
all General American English phonemes should
be represented, and higher than typical repre-
sentation should be given to frequently mis-
pronounced ones (/l, r, s/); (5) the last
sentence should be a question (to try to pre-
vent a reduction of effort); (6) slightly trouble-
some phonemic sequences should be included;
(7) phonetic density should be slightly above
average.

Each speaker identified himself on the test tape by first
speaking his assigned number in the phrase, "I am speaker
number ____." In addition to the test passage, each
speaker read the sentences, "I will say beet," and "I will
say boot." These sentences provided speech material similar
to that used by Dixon (8) for spectrographic analysis.

Tape recording procedures. Stereophonic recording of sub-
jects took place in the double room side of an Industrial Acoustics
Corporation Model 1603-ACT two-room, sound isolated auditory 
test suite. Each subject was seated with his head positioned 
eight inches from two Sony F-121 dynamic microphones. A Sony 
TC-777-4 tape recorder was located in the other room of the test 
suite and was positioned so that the operator could view the 
subject through the connecting window. The test material (test 
passage and sentences) was printed in large type together with 
instructions and positioned for easy reading on a stand. The 
material was printed on the cards following a procedure described 
by Markel (32) to control for pauses. Each line of the copy read 
by the speaker either ended with a punctuation mark or with a 
natural grammatical pause. 

Subjects were asked to maintain their head position in 
relation to the microphones and to begin reading as soon as they 
saw a red light come on which was positioned in the observation 
window of the two-room suite. Subjects were asked to read all of 
the test material two times to provide further practice and a stable 
recording level. During the first reading the operator adjusted the
record levels on the two channels of the Sony 777-4 so that both
V.U. meters peaked at approximately one hundred per cent with only 
occasional transient peaks above this level. The second complete 
reading of the material was recorded. It is believed that the number 
of practice readings helped to eliminate any reading difficulties 
and provided an additional check on subject eligibility. Recording 
was on new Scotch "Dynarange" (201-12) high quality recording
tape at a speed of 7 1/2 inches per second. Auditory monitoring
using a Sony DR-3C headset and visual monitoring using the V.U. 
meters was maintained during the recording. 

Recording of data. A speaker data form was used for 
recording the personal and socioeconomic data from the Question- 
naire Form and the acoustic data from the taped speech samples: 

(1) Calculation of socioeconomic status score. The 
socioeconomic status score was calculated using a method developed 
by the Bureau of the Census, United States Department of Commerce. 
The score is the simple average of three numerical weights assigned 
to the occupation, educational level and the total family income for
the chief income recipient in the family. Weighting scores for
the categories of occupation, educational level and family income 
are given in the report provided by the Bureau of the Census (51) . 
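
The averaging step can be sketched as below. This is an illustration only: the component weights themselves come from the Bureau of the Census report (51) and are not reproduced here, so the function name, the assumed 0-100 range of each weight, and the example values are all hypothetical.

```python
def socioeconomic_status_score(occupation, education, income):
    """Simple average of the three component weights.

    Each weight is assumed to lie on a 0-100 scale, matching the score
    range given in Table 5; the actual weights are tabulated in (51).
    """
    components = (occupation, education, income)
    for value in components:
        if not 0 <= value <= 100:
            raise ValueError("component weights are assumed to lie in 0-100")
    return sum(components) / len(components)


# Hypothetical weights for one chief income recipient.
example_score = socioeconomic_status_score(40, 55, 34)
```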

(2) Calculation of the articulatory product score. The
articulatory product score is a semi-objective score developed and
standardized by Guttman (15) at the Bell Telephone Laboratories.
It represents overall merit of articulatory performance and is be-
lieved by Guttman to contain those factors which listeners use
to make global judgements of speaker quality. Calculation of 
the score uses the variables of whole word articulatory accuracy, 
total number of words in the test passage, optimal speaking 
duration on the test passage and actual speaking duration. The 
values representing these variables are used in the following 
formula which calculates the articulatory product (AP): 

AP = (W / W_0) x (T / T_0)    when T is less than or equal to T_0, or

AP = (W / W_0) x (T_0 / T)    when T is greater than T_0,

where W_0 is the total number of words in the test passage,
W is the number of correctly articulated words, T_0 is the optimum
speaking time for the test passage, and T is the measured speaking
time on the test passage. In summarizing the standardization study,
Guttman (p. 338) says: 

The Articulatory Product correlates highly with 
subjective estimates of speech merit. It may be 
used by a single scorer rather than a panel of 
trained judges, is quantitative, provides a means 
of rapid evaluation, and does not require explicit 
analysis of articulation. 

The AP score for each speaker subject was calculated by 
listening to the recorded speech samples with a headset. Whole 
word articulatory accuracy (W) was judged on the basis of seg- 
mental phonemes. A word was regarded as having been articu- 
lated inaccurately if there was any segmental deviation from 
General American English (GAE) pronunciation as described by 
Kenyon and Knott (23) . A phonetic transcription of the test 
passage constituting the standard against which all speaker 
samples were judged is included in the Appendix. The values 
for W_0 and T_0 were 40 and 15 respectively as described in the
Guttman study. Duration of the speech samples was measured
to provide T using a Haydon Model K15140 electric laboratory
stop clock capable of measuring to 1/100 of a second. An inter-
judge reliability check on duration timing produced a coefficient
of correlation of .99. 
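
The articulatory product formula given earlier can be sketched as a short function. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, the default values of 40 words and 15 (assumed here to be seconds) are those reported above, and the result is returned as a proportion between 0 and 1. The AP scores tabulated in this study appear to be expressed on a 0-100 scale, so multiplying by 100 is an assumption and not part of the formula as printed.

```python
def articulatory_product(correct_words, speaking_time,
                         total_words=40, optimal_time=15.0):
    """Articulatory product: word accuracy times a speaking-time ratio.

    The time ratio is T/T_0 when the reading is no longer than the optimum
    time and T_0/T when it is longer, so departing from the optimum time in
    either direction lowers the score.
    """
    if total_words <= 0 or optimal_time <= 0 or speaking_time <= 0:
        raise ValueError("word count and times must be positive")
    if not 0 <= correct_words <= total_words:
        raise ValueError("correct_words must lie between 0 and total_words")
    accuracy = correct_words / total_words
    if speaking_time <= optimal_time:
        time_ratio = speaking_time / optimal_time
    else:
        time_ratio = optimal_time / speaking_time
    return accuracy * time_ratio


# Hypothetical speaker: 30 of 40 words correct, read in exactly 15 seconds.
example_ap = articulatory_product(30, 15.0)
```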

(3) Calculation of other speech variables. Total number
of phonetic errors on the reading passage; total number of phonetic
distortions, phonetic substitutions and phonetic omissions; and
total number of misarticulated phonemes were determined using
procedures described above.

Interjudge reliability. Studies cited previously by
Morrison (36), Sherman and Morrison (43), and Sherman and
Cullinan (42) indicated that high interjudge reliability can usually
be expected in calculation of speech articulation variables.
Interjudge reliability on the speech variables was investigated for
this research. Independent scoring on each variable was carried
out on twenty-five speaker samples selected at random from the
test tapes by one additional listener trained in speech pathology.
In all cases the coefficient of correlation was found to be .95
or above.
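
An interjudge check of this kind can be sketched as follows. The source does not state which coefficient was used; the sketch assumes the Pearson product-moment coefficient, and the function name and the two sets of scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two judges' scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("two equal-length score lists are required")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for the coefficient to be defined")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical error counts assigned to five speaker samples by two judges.
judge_a = [12, 15, 9, 20, 17]
judge_b = [11, 15, 10, 19, 18]
reliability = pearson_r(judge_a, judge_b)
```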

Preparation of the test tape. All identification numbers used
for speaker subjects were collated and randomly selected to determine
the order of speakers on the test tape. The test tape was pre-
pared by playing the speaker samples in the randomly determined
order on the Sony 777-4 and re-recording using new Scotch 
"Dynarange" tape on a second high quality tape recorder, a 
Viking Model 88. A fifteen second interval of silence was pro- 
vided between speaker samples. During the re-recording process 
adjustments were made for any remaining disparity in the record 
intensity level of the speakers. In preparation of the test tape 
the two sentences, "I will say beet," and "I will say boot" were 
not re-recorded since this material was used only for the spectro-
graphic analysis.

Listening test procedures. Subjects who were used as 
speakers were called back to serve as listener subjects. The 
same data that was recorded for speaker subjects was recorded 
for listeners on a Listener Data Form (see Appendix) . Listeners 
listened and responded to the test tape in groups of fifteen in a 
quiet environment. The test tape containing the randomly arranged 
speech samples was played stereophonically on the Sony 777-4. 
The dual channel signal was fed from the Sony to the input jacks
of a high quality Fisher TX-100 integrated stereophonic master con-
trol amplifier. Koss Model T-4 stereophonic connecting boxes were
connected to the phone output jack of the Fisher to provide a suffi-
cient number of output jacks for fifteen Sony DR-3C stereophonic
headsets. The Sony headsets have a frequency response curve
which is flat through 12,000 Hz.

The intensity level of the signal at the headsets was ad- 
justed to produce the most comfortable listening level (MCL) as 
described by Ward (53) . The level was established by playing 
the test tape for listeners with normal hearing who were not being
used as subjects. The MCL was taken as the mean of their indi-
vidual adjustments of the output gain of the Fisher with the play-
back level controls on the tape recorder held constant.

Instructions to listeners and data collection. Listeners were 
told that they were taking part in a study designed to show to what 
extent certain physical and psychological data can be determined from 
hearing samples of voice. Two experimentally irrelevant items were 
included in the listening task in an attempt to partially mask the major 
intent of the study. This procedure was similar to that used by Dickens
and Sawyer (7). Data was collected on the Listener Response
Form (see Appendix C). Speaker numbers were printed on the form
in the order in which the speaker appeared on the test tape. 

The listener's own identification number was not written on 
his response form until after completion of the listening task. 

It was believed that the effect of any ability on the part of
listeners to recognize their own voices on the tape would cancel
itself out across the sample. Listener responses on the two
variables of racial identification of the speaker and quality
rating of the speaker were tabulated.

Treatment of the data. The function of listener and
speaker variables in racial identification and quality rating of
speakers by listeners was determined using analysis of variance
and co-variance and these results are reported in Chapter IV.
The analysis was made on the Burroughs B5500 computer with
applied multiple linear regression technique as developed by
Bottenberg and Ward (3). In describing this technique (p. vii) the
authors state:

We consider this the most direct and powerful 
approach to the effective formulation and reso- 
lution of a wide variety of research problems. 

Certain widely used procedures, e.g. , analysis 
of variance and analysis of covariance, are 
special cases of this general approach.
It is important to recognize that multiple
regression analysis and multivariate corre- 
lational analysis are based upon different 
assumptions although many of the compu- 
tational aspects of the two procedures are 
similar. In general, the assumptions 
underlying the regression approach are 
less restrictive. Predictor variables 
in linear regression models, for example, 
are not assumed to come from multivariate 
normal distributions. 
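
The point made in the quotation, that analysis of variance is a special case of the regression approach, can be illustrated with a two-group example. This is a sketch under stated assumptions, not the program run for this study: the function name is hypothetical and the ratings are invented. With a predictor coded 0 or 1 for group membership, the fitted intercept equals the mean of the group coded 0 and the slope equals the difference between the two group means, which is the contrast a one-way analysis of variance on two groups tests.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must take at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical quality ratings for two groups of three speakers each,
# with group membership coded as a 0/1 predictor.
group = [0, 0, 0, 1, 1, 1]
rating = [70.0, 75.0, 80.0, 60.0, 65.0, 70.0]
intercept, slope = fit_line(group, rating)
```

Here the group coded 0 has a mean rating of 75 and the group coded 1 a mean of 65, so the fit returns an intercept of 75 and a slope of -10.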

Spectrographic analysis procedures. As described
previously, a sample of the original recorded speech samples
was chosen for intergroup spectrographic comparison. The
sample used in spectrographic analysis was composed of
ten male Negro speakers who had been identified by listeners
at least 95% of the time as being Negro speakers. This group
of ten was compared with ten male white speakers who had
been identified by listeners at least 95% of the time as being
white speakers. The sample of the original group of speakers
consisted of all male speakers because of the considerable
variation found in the acoustic spectra of men and women due
to differences in the fundamental frequencies of their voices and
differences in the size of their vocal tracts.

(1) Description of speakers used in spectrographic analysis.
Table 6 indicates the socioeconomic status scores of speakers used
in the spectrographic analysis. The difference in the mean scores
between the Negro and white groups reflects the difference which
exists in the sample as a whole and in the general population of
Southern United States. Table 7 describes the mean articulatory
product scores of the Negro and white groups compared in
spectrographic analysis. As discussed earlier, it is believed that
the AP score can be taken as an indicator of overall speech merit.

TABLE 6

MEANS AND RANGES OF SOCIOECONOMIC STATUS SCORES
OF SPEAKERS USED IN SPECTROGRAPHIC ANALYSIS

Race of Subjects    Mean SES Score    Range

Negro               30.80             14 - 46
White               63.60             24 - 99
All Subjects        47.20             14 - 99




TABLE 7

MEANS AND RANGES OF ARTICULATORY PRODUCT SCORES
OF SPEAKERS USED IN SPECTROGRAPHIC ANALYSIS

Race of Subjects    Mean AP Score    Range

Negro               56.90            35 - 80
White               80.30            31 - 99
All Subjects        68.60            31 - 99

The difference in the group means on AP scores reported in Table 7
is quite similar to the difference reported in Chapter IV for the 
whole sample used in the study. The coefficient of correlation 
reported in Chapter IV between identification of a speaker as being 
a Negro speaker and the AP score of that speaker would predict that 
speakers consistently identified as being Negro speakers would be 
found to have lower AP scores than speakers consistently identified 
as being white speakers. 

(2) Speech material used in spectrograms. The spectrographic
comparison was carried out on the vowels (u) and (i) using the
variables of formant frequencies and relative formant amplitudes listed
previously in the Experimental Design. These vowels were chosen
because, according to Dixon (8), they are most stable and represent
extremes physiologically and acoustically. It was believed that,
because of this factor, these two vowels might provide the most
potential for locating significant spectral differences. Shuy (44)
has implied that one of the phonological correlates of socioeconomic
stratification may be the presence, absence or substitution of nasal
components. According to Dixon (8), the (i) vowel with its high
F2 and the (u) vowel with its low F2 are most useful clinically
in the diagnosis of nasality. This gave added support to the choice
of these vowels for this study. The recorded samples of the vowels
to be studied were contained in the short sentences "I will say beet"
and "I will say boot" recorded after the test reading passage by all
subjects.

(3) Equipment and recording procedure. The spectrographic
analysis was carried out on a Kay Electric Company Sona-Graph
Model 6061-A. This instrument is an audio frequency spectrum
analyzer which produces permanent graphic recordings of any type
of complex wave in the range of 85 Hz to 8000 Hz (22). The Sona-Graph
is a commercial version of the unit originally developed by
Bell Telephone Laboratories and described by Koenig, Dunn and
Lacy (25). The test material containing the vowels to be studied
was transferred from one channel of the original tape recorded on
the Sony 777-4 to the recording turntable of the Sona-Graph. The
Sona-Graph 600 ohm input was used as it is compatible with the line
output impedance of the Sony 777-4. The gain on the tape recorder
was held constant and the input attenuator of the Sona-Graph was
adjusted to show a 0 peak on the V.U. meter. Since measurements
were carried out on relative rather than absolute formant amplitudes,
precise control of input signal intensity was not considered critical.

(4) Reproducing spectrograms. The Sona-Graph is capable
of producing two types of graphic display of recorded material:
Type 1, a frequency by time display; and Type 2, a frequency by
intensity display. All words were reproduced at V.U. meter peaking
of -2 using the 45-cycle filter to resolve the harmonics of the
vowels being studied. A Type 1 and a Type 2 display were made of
both test words for every speaker in the sample. The intensity
section used for spectral measurement was a frequency by intensity
display of a five-millisecond period in the steady-state portion of
the vowel in each test word. It was believed that it was at this
point that there would be minimal influence on the vowel spectrum
from surrounding consonants. The steady-state portion was
established visually using a procedure described by Dixon (8)
and selected from the frequency by time display before removing
that display from the display drum. In making the frequency by
intensity displays, the 500-cycle tone built into the Sona-Graph
was used after each test word to provide a calibration line on the
Sona-Graph paper every 500 cycles across the 8000-cycle frequency
range.

(5) Formant frequency measurements. The 8000-cycle
frequency range of the sound spectrograph was displayed in a
vertical distance of four inches on the spectrogram. The calibration
tone provided a line on each spectrogram every 500 cycles
from 500 cycles through 8000 cycles. The frequency response of
the instrument was found to be linear across the frequency range.
Discrete frequencies of vowel formants were determined through
use of a transparent plastic overlay template. This template
was constructed by marking the 500-cycle calibration lines on the
plastic and interpolating between the lines. This produced a frequency
scale with mark points every 62 cycles and with additional
interpolations possible between markings. The first 500-cycle
line on the template was aligned with the first 500-cycle line on
each amplitude section and formant frequency measurements were
made from this reference using the peak of the strongest resonance
within the formant band as the formant frequency.
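
The arithmetic behind the overlay template can be sketched as follows. This is an illustrative reconstruction only: the constant and function names are hypothetical, and the assumption of eight equal subdivisions between calibration lines is inferred from the stated spacing of about 62 cycles (500 / 8 = 62.5) rather than taken from the text.

```python
HZ_RANGE = 8000.0            # full frequency range shown on the spectrogram
DISPLAY_INCHES = 4.0         # vertical distance that range occupies
CALIBRATION_STEP_HZ = 500.0  # spacing of the built-in calibration lines
SUBDIVISIONS = 8             # ASSUMED number of template steps per 500 cycles


def inches_to_hz(distance_in):
    """Convert a vertical distance above the baseline to frequency (linear scale)."""
    return distance_in * HZ_RANGE / DISPLAY_INCHES


def nearest_mark(freq_hz):
    """Snap a frequency to the nearest template mark point."""
    step = CALIBRATION_STEP_HZ / SUBDIVISIONS
    return round(freq_hz / step) * step


# The first calibration line sits a quarter of an inch above the baseline.
print(inches_to_hz(0.25))
# A hypothetical formant peak read at 2290 cycles falls nearest this mark.
print(nearest_mark(2290.0))
```

Because the instrument response was found to be linear, a single scale factor of 2000 cycles per inch applies across the whole display.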

(6) Relative formant amplitude measurements. Following
a procedure described by Dixon (8), a transparent plastic intensity
scale template was constructed using 1/16 of an inch to represent
1 dB of attenuation. Since the interest was in relative formant
frequency intensities, the 0 dB line on the template was set in
each case on the F1 peak and readings of relative amplitudes were
made from this reference using the variables of F1 - F2 (difference),
F2 - F3 (difference) and F1 - F3 (difference) stated in the Experimental
Design.
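
The template reading can be expressed as a short calculation. The sketch below is illustrative, not the procedure as actually carried out: the function names are hypothetical, the example distances are invented, and it assumes each formant peak is read as a distance below the F1 reference on the template.

```python
INCHES_PER_DB = 1.0 / 16.0  # template scale: 1/16 inch represents 1 dB


def peak_level_db(drop_below_f1_inches):
    """Level of a peak relative to F1 (0 dB); negative means weaker than F1."""
    return -drop_below_f1_inches / INCHES_PER_DB


def relative_amplitudes(f2_drop_in, f3_drop_in):
    """Return the three amplitude differences used as variables, in dB."""
    f1 = 0.0  # the 0 dB line is set on the F1 peak
    f2 = peak_level_db(f2_drop_in)
    f3 = peak_level_db(f3_drop_in)
    return {"F1-F2": f1 - f2, "F2-F3": f2 - f3, "F1-F3": f1 - f3}


# Hypothetical reading: F2 peak 0.5 inch and F3 peak 1.25 inches below F1.
example = relative_amplitudes(0.5, 1.25)
print(example)
```

Setting the reference on F1 means that only differences between peaks enter the analysis, which is why precise control of absolute input level was not required.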

(7) Statistical treatment of the data. Intergroup comparison
was carried out on the means described, using the data derived by
the above procedures; results are reported in Chapter IV. t-tests
were used to determine statistically significant mean differences.
The null hypothesis stated for this part of the study was that no
significant differences would be found in intergroup comparison
of mean spectrographic values. Alpha was set at .05.
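
A minimal sketch of such a comparison is given below. The text does not state which form of the t-test was used, so the pooled-variance, two-tailed version shown here is an assumption; the function name `pooled_t` is hypothetical and the formant values are invented for illustration, not measurements from this study. The critical value 2.101 is the standard tabled value for 18 degrees of freedom at alpha = .05, two-tailed.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical first-formant readings in cycles for two groups of ten speakers.
group_1 = [300, 310, 290, 305, 295, 300, 310, 290, 305, 295]
group_2 = [310, 320, 300, 315, 305, 310, 320, 300, 315, 305]

CRITICAL_T_DF18 = 2.101  # two-tailed, alpha = .05, 18 degrees of freedom

t_value, df = pooled_t(group_1, group_2)
reject_null = abs(t_value) > CRITICAL_T_DF18
print(t_value, df, reject_null)
```

With two groups of ten speakers the comparison has 18 degrees of freedom, so the null hypothesis of no intergroup difference is rejected only when the absolute t value exceeds the tabled critical value.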

III. PERFORMANCE OF SUBJECT GROUPS 
ON SPEAKING AND LISTENING TASKS 

The articulatory product scores and scores obtained by 
subjects on the following speech variables were tallied: 

1. Number of phonetic errors; 

2. Number of phonetic distortions; 

3. Number of phonetic substitutions;

4. Number of phonetic omissions; 

5. Number of misarticulated phonemes.

The means and ranges for Negro and white, male and female
subjects were calculated. This procedure was also carried
out on the responses made by subjects on the listening tasks.
The group means on all scores were examined for comparison
with previous studies and to determine group differences
which would necessitate co-variance control in the regression
analysis. The group mean scores and ranges are reported in
this section.



Articulatory product scores of subjects. Table 8 describes
the means and ranges of articulatory product scores of Negro and
white subjects with an indication of the relative scores obtained by
male and female subjects within each group. The AP score described
previously is a semi-objective index of merit of articulatory
performance which combines measures of whole word articulatory
accuracy and rate of articulation. The measures and calculation
can be carried out reliably by one judge. The resulting product
has been found to correlate highly with subjective ratings of
speech merit offered by groups of judges (15). The scale of AP
scores is 0-100 with the latter representing the best possible
performance.

TABLE 8

ARTICULATORY PRODUCT SCORES OF SUBJECTS

Race of Subjects    Mean AP Score    Range

Negro               57.34            32 - 83
  male              56.26            35 - 80
  female            58.66            32 - 83
White               76.79            49 - 95
  male              76.22            49 - 95
  female            78.15            50 - 95
All Subjects        66.74            32 - 95

It can be noted in Table 8 that the mean score for all
subjects was 66.74 and that the ranges and means for Negroes
were consistently below those of whites. Within both groups the
female speakers were found to score somewhat higher on the
scale than the males, with a difference of 2.40 for Negroes and
1.93 for whites.

Speech proficiency ratings received by speakers.
Perceptions of the overall quality of the tape recorded speech
samples were marked by listeners on a three-point scale:
1 good speaker, 2 average speaker, 3 poor speaker (see Appendix C).
The means and ranges of speech proficiency ratings
received by speakers are recorded in Table 9 on a scale from
1.00 to 3.00. On this scale the good speaker (best possible
rating) is represented by 1.00, the average speaker by 2.00 and
the poor speaker (worst possible rating) by 3.00.

Table 9 shows that, in ratings received from all subjects,
Negro speakers received consistently poorer ratings (numerically
higher means on this scale). The same relative difference exists
when ratings offered by Negro listeners and by white listeners are
considered separately. In all cases female speakers are
consistently judged to be superior speakers.
This finding of relative intergroup differences is the same as
that reported in Table 8 concerning the articulatory product scores.
TABLE 9

MEAN SPEECH PROFICIENCY RATINGS RECEIVED BY SPEAKERS

Race of Speakers    Mean Proficiency     Range
                    Rating Received

Received From All Listeners

Negro               2.27                 1.31 - 2.91
  male              2.36                 1.38 - 2.88
  female            2.16                 1.31 - 2.91
White               1.80                 1.26 - 2.70
  male              1.87                 1.26 - 2.67
  female            1.80                 1.33 - 2.70
All Speakers        2.04                 1.26 - 2.91

Received From Negro Listeners

Negro               2.22                 1.37 - 2.86
  male              2.30                 1.67 - 2.84
  female            2.20                 1.37 - 2.86
White               1.79                 1.37 - 2.56
  male              1.83                 1.37 - 2.51
  female            1.69                 1.44 - 2.56
All Speakers        2.01                 1.37 - 2.86

Received From White Listeners

Negro               2.32                 1.21 - 2.95
  male              2.42                 1.23 - 2.95
  female            2.21                 1.26 - 2.95
White               1.79                 1.12 - 2.84
  male              1.87                 1.12 - 2.51
  female            1.63                 1.16 - 2.84
All Speakers        2.06                 1.12 - 2.95

NOTE: The scale of speech proficiency in this and in
succeeding tables is 1.00 (best possible rating) to 3.00 (lowest
possible rating).



Speech proficiency ratings given to speakers by listeners.
The previous table described ratings received by speakers, and
intergroup speaker differences in these ratings were pointed out
as they were received from Negro and white listeners. Table 10,
on the other hand, indicates ratings given to speakers by listeners.
The designations in the far left column of the previous table referred
to speakers while the designations of race and sex in Table 10
refer to listeners.



TABLE 10

MEAN SPEECH PROFICIENCY RATINGS GIVEN TO SPEAKERS
BY LISTENERS

Race of Listeners   Mean Proficiency     Range
                    Ratings Given

Ratings Given to All Speakers

Negro               2.04                 1.55 - 2.52
  male              2.03                 1.55 - 2.43
  female            2.04                 1.57 - 2.52
White               2.11                 1.90 - 2.49
  male              2.15                 1.91 - 2.49
  female            2.01                 1.90 - 2.33
All Listeners       2.07                 1.57 - 2.52

Ratings Given to Speakers Perceived to Be Negro

Negro               2.25                 1.65 - 2.71
  male              2.22                 1.65 - 2.71
  female            2.29                 1.67 - 2.62
White               2.44                 1.89 - 2.88
  male              2.47                 1.89 - 2.88
  female            2.35                 2.02 - 2.82
All Listeners       2.34                 1.67 - 2.88

Ratings Given to Speakers Perceived to Be White

Negro               1.83                 1.24 - 2.83
  male              1.83                 1.40 - 2.24
  female            1.74                 1.24 - 2.83
White               1.82                 1.22 - 2.70
  male              1.83                 1.41 - 2.13
  female            1.78                 1.22 - 2.70
All Listeners       1.82                 1.22 - 2.83



An important feature to be noted in Table 10 is that there
is homogeneity in the way in which Negroes and whites, males
and females rate speakers. This congruence of intergroup ratings
is the same when speakers are perceived to be Negro and when
speakers are perceived to be white. The difference between the
mean of 2.34 given to speakers perceived to be Negro and the mean
of 1.82 given to speakers perceived to be white is consistent with
the intergroup speech proficiency differences noted in the AP score,
Table 8, and ratings received by speakers, Table 9.

Subject self-rating of speech proficiency. Table 11 offers
means of the self-ratings made by Negro and white, male and female
subjects. While Negroes can be seen to give themselves poorer
ratings than those made by white subjects, this difference is
consistent with intergroup speech proficiency differences indicated
by the findings reported previously in this chapter. Some tendency
can be seen in this table for male subjects of both races to rate
themselves somewhat better than female subjects rate themselves.
This tendency appears to be slightly stronger among white
males.

TABLE 11

SUBJECT SELF-RATING OF SPEECH PROFICIENCY

Race of Subjects    Mean Self-Rating

Negro               2.21
  male              2.15
  female            2.29
White               1.93
  male              1.74
  female            2.13
All Subjects        2.07
  male              1.94
  female            2.21

Listener accuracy in identification of the race of speakers.
Table 12 indicates the mean percentage of accurate racial
identifications made by Negro and white, male and female listeners.
There is a slight tendency for white listeners to be somewhat
more accurate than Negro listeners in racial identification ability
with the exception of Negro male and white male identification
of white male and female speakers. Generally, males and females
tend to be equally accurate in racial identification of speakers.
White females had the best mean performance in identifying white
speakers and Negro females showed the least accuracy in their
identification of whites. These means, 87.24 and 80.12, represent
the overall range of accuracy on the identification task.
Generally, listeners are seen to be equally accurate in identifying
both Negroes and whites.

TABLE 12



LISTENER ACCURACY IN IDENTIFICATION OF THE
RACE OF SPEAKERS

Race of Listeners   Mean Percentage      Range
                    of Accuracy

Accuracy in Identification of All Speakers

Negro               82.38                59.30 - 95.60
  male              82.15                59.30 - 95.60
  female            83.43                62.90 - 93.40
White               85.36                60.40 - 96.70
  male              85.06                60.40 - 96.70
  female            84.50                76.90 - 93.40
All Listeners       83.82                59.30 - 96.70

Accuracy in Identification of Negro Speakers

Negro               82.34                27.60 - 100
  male              79.65                27.60 - 100
  female            81.77                70.20 - 100
White               86.03                65.90 - 100
  male              86.55                65.90 - 100
  female            84.90                65.90 - 95.70
All Listeners       84.12                27.60 - 100

Accuracy in Identification of White Speakers

Negro               82.30                22.70 - 95.40
  male              84.62                65.90 - 95.40
  female            80.12                22.70 - 95.40
White               84.56                54.40 - 95.40
  male              83.36                54.40 - 95.40
  female            87.24                72.70 - 95.40
All Listeners       83.39                22.70 - 95.40



Speaking duration on the test passage. Although duration
on the test passage was one of the components used in calculation
of the articulatory product score, means were calculated on this
factor as an independent variable in Table 13. A consistent difference
in performance can be seen between Negro and white speakers and
between males and females. The group means for duration range
from 14.40 seconds for white males to 18.24 seconds for
Negro females.

TABLE 13

SPEAKING DURATION ON THE TEST PASSAGE

Race of Speakers    Mean Duration        Range
                    in Seconds

Negro               17.32                12.32 - 23.81
  male              16.58                12.38 - 23.17
  female            18.24                12.32 - 23.81
White               14.70                11.30 - 24.58
  male              14.40                11.30 - 24.58
  female            15.44                13.82 - 19.67
All Speakers        16.05                11.30 - 24.58

NOTE: Optimal speaking duration on the test passage
was 15.00 seconds.



Words misarticulated by speakers on the test passage.
The total number of words misarticulated was also a component
in the articulatory product score, but independent means are
provided for this variable in Table 14. It can be noted that, as
in the case of previous variables, there is a consistent difference
between the mean number of words misarticulated by Negro speakers
and the mean number of words misarticulated by white speakers.
Negro speakers misarticulated more words than white speakers and
Negro females performed better than Negro males. The reverse was
true in the case of white speakers with males misarticulating slightly
fewer words than females.

TABLE 14

NUMBER OF WORDS MISARTICULATED BY SPEAKERS
ON THE TEST PASSAGE

Race of Speakers    Mean Number of           Range
                    Words Misarticulated

Negro               12.82                    5 - 23
  male              14.69                    6 - 22
  female            11.24                    5 - 23
White                5.68                    0 - 17
  male               5.65                    0 - 14
  female             5.77                    2 - 17
All Speakers         9.37                    0 - 23

NOTE: The reading passage contained a total of 40 words.

Phonetic errors by speakers on the test passage. Consistent
differences between the performance of Negro and white speakers can
be seen on the three types of phonetic error reported in Table 15
with white speakers having fewer phonetic errors. The difference
between the means for Negro and white speakers on phonetic
substitutions was 3.23, the difference on omissions was 4.94 and the
difference on distortions was 1.66. The same relationship exists
between the performance of male and female speakers as was
reported in the previous table.
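
The intergroup differences can be recomputed directly from the group means reported in Table 15. The short sketch below does only that arithmetic; the variable names are illustrative and the values are the Negro and white group means as printed in the table.

```python
# Group means as printed in Table 15 (mean number of errors per speaker).
table_15_means = {
    "Negro": {"substitutions": 6.57, "omissions": 6.80, "distortions": 2.93},
    "White": {"substitutions": 3.34, "omissions": 1.86, "distortions": 1.27},
}

# Difference between the Negro and white group means for each error type.
differences = {
    error_type: round(table_15_means["Negro"][error_type]
                      - table_15_means["White"][error_type], 2)
    for error_type in table_15_means["Negro"]
}

for error_type, diff in differences.items():
    print(error_type, diff)
```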

TABLE 15

NUMBER OF PHONETIC ERRORS BY SPEAKERS
ON THE TEST PASSAGE

Race of         Mean No.         Mean No.     Mean No.       Total
Speakers        Substitutions    Omissions    Distortions

Negro           6.57             6.80         2.93           16.31
  male          7.11             6.89         3.19           17.19
  female        5.90             6.72         2.62           15.24
White           3.34             1.86         1.27            6.47
  male          3.13             1.77         1.32            6.22
  female        3.85             2.08         1.15            7.08
All Speakers    5.01             4.41         2.13           11.56

Phonemes misarticulated by speakers on the test passage.
The mean number of misarticulated phonemes reported in Table 16
was obtained by counting how many of the 34 phonemes of standard
American English were judged to be incorrect on the test passage.
The count did not consider the number of times a given phoneme
may have been misarticulated. It can be seen that the same
consistency of differences between Negroes and whites, males and
females is indicated on this table as was noted in previous tables.

TABLE 16

NUMBER OF PHONEMES MISARTICULATED BY SPEAKERS
ON THE TEST PASSAGE

Race of Speakers    Mean Number                Range
                    Misarticulated Phonemes

Negro               9.42                       1 - 17
  male              9.92                       1 - 17
  female            8.80                       2 - 17
White               5.00                       0 - 15
  male              4.87                       0 - 14
  female            5.30                       2 - 15
All Speakers        7.28                       0 - 17

NOTE: The test passage contained 34 phonemes of standard
American English.

Summary of subject performance.

1. Negro subjects were found to have consistently lower articulatory
product scores than white subjects;

2. Negro speakers were consistently rated lower in speech quality
by both Negro and white listeners;

3. The relative difference in quality ratings received by Negro and
white speakers was the same when ratings offered by Negro
listeners and by white listeners were considered separately;

4. On both the articulatory product score and listener ratings,
female speakers of both races were found to have ratings superior
to male speakers of both races;

5. There was homogeneity in the way in which Negroes and whites,
males and females rated speakers;

6. The homogeneity of intergroup ratings was the same when speakers
were perceived to be Negro and when speakers were perceived to
be white;

7. Negro speakers rated themselves lower in speech quality than
white speakers rated themselves but the difference was
consistent with intergroup quality score differences;

8. Male subjects rated themselves better than female subjects
rated themselves;

9. Listeners were found to have a range of 80.12% to 87.24%
accuracy in the identification of the race of speakers;

10. Listeners were found to be equally accurate in identifying both
Negro and white speakers;

11. Negro speakers were found to be slower speakers than white
speakers and female subjects were consistently slower than
male subjects;

12. Negro speakers were found to misarticulate more words on the
test passage than white speakers;

13. Negro subjects were found to have more phonetic errors than
white subjects;

14. Negro subjects were found to misarticulate more of the 34 phonemes
included in the test passage than white subjects.

Discussion of subject performance. The study by Dickens and
Sawyer (7) discussed in Chapter II provides some points of interesting
comparison with the subject scores reported here. For instance, their
investigation found that listeners were approximately 70% accurate in
identifying the race of speakers. This is in contrast to the 83.82% mean
listener accuracy found in this study.

Dickens and Sawyer reported having found that white observers
were more accurate in racial identification than Negro observers and
that there was greater accuracy shown by observers in identifying
speakers of their own race. While Table 12 of this study reports
a slight difference in identification accuracy in favor of white listeners,
the difference is of very low magnitude. In contrast with the finding of
Dickens and Sawyer, this study did not show greater accuracy on the
part of listeners in identifying speakers of their own race. Dickens
and Sawyer found greater accuracy in identifying male speakers.
This finding was not confirmed by the present study.

On the matter of speaker quality, Dickens and Sawyer reported 
that the combined judgement of all speakers rated Negro females and 
white males as highest in vocal quality. This contrasts with the study 
reported here in which Table 9 indicates that white males and white 
females were judged to be the best speakers. The findings of the
present study regarding the amount of bias apparent in quality rating
of speakers agree with the finding in the study by Dickens and Sawyer,
in which no bias was found.

One explanation for the contrast in findings between the study 
reported here and that of Dickens and Sawyer is that the latter study 
used all university students as subjects. Apparently, there was no 
attempt to draw a sample which would be somewhat similar to a 
population on the socioeconomic status variable. 



Some findings of the study reported here tend to agree
with those presented by Larson and Larson (31). In findings similar
to those reported in this study, Larson and Larson found that listeners
tend to favor white pronunciations and are able to distinguish between
white and Negro speakers with a high degree of accuracy. In the
Larson and Larson study, however, there was apparently no attempt
to obtain a more objective indicator of speaker quality against which
to compare the listener ratings. An important feature of the study
reported here is that, while it was found that listeners rate white
speech as better than Negro speech, this difference in rating is
consistent with differences indicated by the semi-objective AP scores.

The current study confirms the finding reported by Larson and
Larson that Negro listeners tend to agree with white listeners in judging
the vocal quality of both Negro and white speakers. This may mean,
as Larson and Larson imply, that Negroes implicitly accept the white
standard of speech performance as more valuable. In the present
study there were no consistent intergroup differences in the way in
which Negro and white listeners rated Negro and white speakers.

Edmonds (9) found that socioeconomic status had a greater
relationship to verbal ability than the sex of the speaker. This finding
was confirmed by the present study. The coefficient of correlation
between the socioeconomic status score and the AP score was .48
while the relationship of AP to sex was .08.

Two findings of the current study on subject performance
contrast with those reported by Stroud (47). It was found by Stroud
that listeners were able to correctly identify the race of speakers
93% of the time. This contrasts with the 84% accuracy attained by
listeners in this study. Stroud also reported that there was a positive
correlation between socioeconomic status and accuracy in identifying
the race of speakers. The correlation between these factors in this
study was non-significant at .19.

The findings of this study on subject performance indicate that,
while the phonological patterns of Negro speakers are identified by
listeners with considerable accuracy, there is little difference between
the two groups in the standards of speech performance applied to
speakers. An implication of this latter finding is that white and Negro
listeners can be expected to agree on the quality of performance of
speakers. This may mean that if habilitation programs can be structured
which are directed at raising the articulatory product scores of Negro
speakers, their quality ratings by both Negro and white listeners will
be improved.



CHAPTER IV 



RESULTS 

The purpose of the experimental design and procedures
described in the previous chapter was to answer the questions
posed for this study in Chapter II. These questions all relate to
two areas of inquiry: 1. informational content and social dialect
analysis directed at specification of variables which may function
significantly in the racial identification and quality rating of
speakers and, 2. a spectrographic analysis directed at specification
of differences which may exist in the vocal resonance
characteristics of speakers most consistently identified as Negroes
and speakers most consistently identified as whites.

The independent variables selected for information content 
and social dialect analysis are listed here for review: 

1. Race;

2. Age;

3. Sex;

4. Socioeconomic status score;

5. Articulatory product score;

   a. speaking time duration,

   b. number of misarticulated words,

6. Articulation errors;

   a. number of phonetic distortions,

   b. number of phonetic omissions,

   c. number of phonetic substitutions,

7. Total number of misarticulated phonemes;

8. Self-rating of speech proficiency.

The spectrographic analysis was independent of informational
content and social dialect analysis and was carried out
on a sample of the sample used in that investigation. Variables
chosen for intergroup comparison of resonance characteristics
included the following:

1. Formant one frequency of the (i) and (u) vowels; 

2. Formant two frequency of the (i) and (u) vowels; 

3. Formant three frequency of the (i) and (u) vowels; 

4. Formant one/formant two relative amplitudes of 
the (i) and (u) vowels; 

5. Formant two/formant three relative amplitudes of 
the (i) and (u) vowels; 

6. Formant one/formant three relative amplitudes of 
the (i) and (u) vowels. 







Ninety-one subjects were chosen to provide the speech data
and eighty-six of these individuals served as subjects in the listening
task. Subjects were chosen to provide a sample reasonably representative
of the distribution of socioeconomic status scores in Southern
United States. The experimental procedures described in Chapter III
were carried out on this sample.

This chapter will describe the results of the analyses in two
sections: 1. results of the regression analysis used to answer the
questions posed in Chapter II relative to the function of the independent
variables in predicting the dependent variables and, 2. results of the
spectrographic analysis and statistical comparison of means used to
answer the question posed in Chapter II concerning possible intergroup
differences on spectrographic variables.

I. RESULTS OF REGRESSION ANALYSIS 
The function of variables in predicting racial identification and 

rating of speakers. A regression analysis was carried out on the variables 
discussed previously. The purpose of the analysis was to answer the 
questions posed for this study regarding the function of each of the inde- 
pendent variables in determining how speakers are identified as to race 
and how they are rated in quality by listeners. The independent variables 
were used to generate full models for regression analysis using the following 




criterion variables: 
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1. The number of times speakers were identified as Negro 

speakers; 

2. The number of times speakers were identified as white 
speakers; 

3. The quality rating received by all speakers; 

4. The quality rating received by Negro speakers; and 

5. The quality rating received by white speakers. 

The zero order correlations between the independent and dependent
variables were examined for potential significance. From this
examination variables were chosen for the full models and the
systematically restricted models used in the regression analysis.
Table 17 indicates the zero order correlations which were found to
exist between the independent variables and the number of times
speakers were identified as being Negro speakers and the number of
times speakers were identified as being white speakers. Table 18
indicates the zero order correlations which were found to exist
between the independent variables and quality ratings received by
all speakers from Negro and white listeners. Table 19 indicates the
zero order correlations which were found to exist between the
independent variables and quality ratings received by Negro and white
speakers.
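As an illustration only, the zero order correlations reported in Tables 17 through 19 are ordinary product-moment correlations between one predictor and one criterion. The sketch below assumes that definition; the function name `pearson_r` and the six-speaker data are hypothetical and are not taken from the study. It also shows why the two complementary race vectors must yield correlations of equal size and opposite sign with any single criterion.

```python
import math


def pearson_r(x, y):
    """Return the zero order (Pearson) correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data for six speakers (assumed values, not from the study).
race_negro = [1, 1, 1, 0, 0, 0]            # dummy vector: 1 = Negro speaker
race_white = [1 - v for v in race_negro]   # complementary dummy vector
times_identified_negro = [40, 38, 35, 6, 10, 3]

r_negro = pearson_r(race_negro, times_identified_negro)
r_white = pearson_r(race_white, times_identified_negro)
print(round(r_negro, 4), round(r_white, 4))
```

Because the white vector is one minus the Negro vector, its deviations from the mean are the exact negatives of the Negro vector's deviations, which is why each pair of race rows in the tables mirrors in sign within a column.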
TABLE 17

ZERO ORDER CORRELATIONS BETWEEN INDEPENDENT VARIABLES
AND THE RACIAL IDENTIFICATION OF SPEAKERS

                                                  Correlation
                                        With Percentage     With Percentage
                                        Speaker Perceived   Speaker Perceived
Variable                                as Negro            as White

(X1)  Race - Negro                           0.8469             -0.8603
(X2)  Race - White                          -0.8469              0.8603
(X3)  Age                                    0.2081             -0.2016
(X4)  Sex - Male                            -0.1242              0.0985
(X5)  Sex - Female                           0.1242             -0.0985
(X6)  SES Score                             -0.4841              0.4796
(X7)  AP Score                              -0.6566              0.6723
(X8)  Rating by Listeners                    0.6239             -0.6322
(X9)      by Negroes                         0.6257             -0.6243
(X10)     by Whites                          0.6648             -0.6645
(X11) Ratings given to Speakers             -0.2675              0.2318
(X12)     to Negroes                        -0.4051              0.3824
(X13)     to Whites                         -0.0570              0.0325
(X14) Self-rating                            0.3048              0.2750
(X15) Duration                               0.2503             -0.2663
(X16) Number of Misarticulated Words         0.7158             -0.7306
(X17) Number of Phonetic Errors              0.6729             -0.6818
(X18) Number of Substitutions                0.6386             -0.6311
(X19) Number of Omissions                    0.5666             -0.5760
(X20) Number of Distortions                  0.5214             -0.5545
(X21) Number of Misarticulated Phonemes      0.6418             -0.6516
TABLE 18

ZERO ORDER CORRELATIONS BETWEEN INDEPENDENT
VARIABLES AND THE QUALITY RATING RECEIVED BY
ALL SPEAKERS

                                             Correlation with Rating Received
                                        By All       From Negro    From White
Variable                                Speakers     Listeners     Listeners

(X1)  Race - Negro                       0.4888        0.4909        0.5114
(X2)  Race - White                      -0.4888       -0.4909       -0.5114
(X3)  Age                               -0.1084       -0.1144       -0.0952
(X4)  Sex - Male                         0.1267        0.1084        0.1174
(X5)  Sex - Female                      -0.1267       -0.1084       -0.1174
(X6)  SES Score                         -0.6315       -0.6242       -0.6497
(X7)  AP Score                          -0.7303       -0.7434       -0.7220
(X8)  Ratings by Listeners              -0.2394       -0.3102       -0.2341
(X9)      by Negroes                    -0.2613       -0.3150       -0.2894
(X10)     by Whites                     -0.0959       -0.1387       -0.0677
(X14) Self-rating                        0.2965        0.2962        0.3123
(X15) Duration                           0.2176        0.2414        0.1896
(X16) Total Misarticulated Words         0.7553        0.7726        0.7621
(X17) Total Phonetic Errors              0.7214        0.7414        0.7236
(X18) Total Substitutions                0.7170        0.7356        0.7245
(X19) Total Omissions                    0.6146        0.6419        0.6247
(X20) Total Distortions                  0.4501        0.4323        0.4144
(X21) Total Misarticulated Phonemes      0.6992        0.7113        0.7062
TABLE 19

ZERO ORDER CORRELATIONS BETWEEN INDEPENDENT
VARIABLES AND THE QUALITY RATING RECEIVED BY
NEGRO AND WHITE SPEAKERS

                                        Correlations with    Correlations with
                                        Rating Received      Rating Received
Variable                                By Negro Speakers    By White Speakers

(X1)  Race - Negro                           0.0000               0.0000
(X2)  Race - White                           0.0000               0.0000
(X3)  Age                                   -0.4378              -0.3496
(X4)  Sex - Male                             0.2166               0.2653
(X5)  Sex - Female                          -0.2166              -0.2653
(X6)  SES Score                             -0.6808              -0.4148
(X7)  AP Score                              -0.5670              -0.7187
(X14) Self-rating                           -0.3056               0.3048
(X15) Duration                               0.0035               0.0368
(X16) Total Misarticulated Words             0.6485               0.7021
(X17) Total Phonetic Errors                  0.6322               0.6384
(X18) Total Substitutions                    0.6309               0.6420
(X19) Total Omissions                        0.5721               0.3947
(X20) Total Distortions                      0.0945               0.6644
(X21) Total Misarticulated Phonemes          0.5921               0.6349
The following independent variables were used in the
initial full predictor model generated for regression analysis:

1.  Race of speaker (Negro or white);
2.  Age of speaker;
3.  Sex of speaker;
4.  Socioeconomic status score;
5.  Articulatory product score;
6.  Ratings received by speakers;
7.  Self-ratings made by speakers;
8.  Racial identification accuracy;
9.  Speaking duration;
10. Total misarticulated words;
11. Total phonetic errors;
12. Total phonetic substitutions;
13. Total phonetic omissions;
14. Total phonetic distortions;
15. Total misarticulated phonemes.

The full model was systematically restricted by removal
of variables and analyzed with the criterion variables to test
hypotheses regarding the function of the independent variables in
predicting the dependent variables. The F-ratios resulting from the
different combinations of full and restricted models were checked by
comparison with the theoretical F distribution. The regression
analysis was carried out on the Burroughs B5500 electronic data
processing system using a program based upon the work of Bottenberg
and Ward (3) which was described in Chapter III.
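The comparison of a full model with a restricted model can be sketched as follows. This is a minimal illustration of the usual F-ratio formed from the squared multiple correlations of the two models, not a reproduction of the B5500 program; the function name `f_ratio`, its argument names, and the numerical values are assumptions introduced here for illustration.

```python
def f_ratio(r2_full, r2_restricted, k_full, k_restricted, n_obs):
    """F-ratio for testing a restricted regression model against a full model.

    r2_full, r2_restricted: squared multiple correlations of the two models.
    k_full, k_restricted:   numbers of linearly independent predictor vectors
                            (including the unit vector) in each model.
    n_obs:                  number of observations (speakers).
    Returns (F, df1, df2).
    """
    if not (0.0 <= r2_restricted <= r2_full < 1.0):
        raise ValueError("expected 0 <= r2_restricted <= r2_full < 1")
    df1 = k_full - k_restricted   # vectors removed by the restriction
    df2 = n_obs - k_full          # residual degrees of freedom, full model
    if df1 <= 0 or df2 <= 0:
        raise ValueError("degrees of freedom must be positive")
    f = ((r2_full - r2_restricted) / df1) / ((1.0 - r2_full) / df2)
    return f, df1, df2


# Hypothetical values (not from the study): removing one vector from a
# ten-vector full model fitted to 91 speakers.
f, df1, df2 = f_ratio(0.80, 0.75, k_full=10, k_restricted=9, n_obs=91)
print(round(f, 2), df1, df2)
```

The resulting F is then referred to the theoretical F distribution with df1 and df2 degrees of freedom; a large value indicates that the removed variable contributes to prediction of the criterion.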

Table 20 lists the five criterion variables developed
from the questions posed for this study in Chapter II. The
predictor variables are those found to produce significant
F-ratios when the full models were restricted by these
variables and run against the criterion variables in regression
analysis. The five criterion variables listed previously
were found to produce three predictor variables whose
F-ratios indicate that they are significant factors in the
racial identification and rating of Negro and white speakers
by Negro and white listeners. Analysis of co-variance
was used to control for intergroup mean difference on the
socioeconomic status scores and the articulatory product
scores. The significance of the predictor variables in
predicting the dependent variables listed previously was
maintained under this control.

TABLE 20

Table 20 indicates that the number of phonetic
distortions made by the speaker is important in determining
whether he will be perceived as a Negro speaker or as a white
speaker. The correlation of .52 between criterion variable y1
and phonetic distortions indicates that the greater the number of
distortions the more likely it is that a given speaker will be
identified as a Negro speaker. The -.55 correlation between
criterion variable y2 and phonetic distortions indicates that the
fewer the number of distortions the more likely it is that a given
speaker will be identified as a white speaker.

The quality ratings received by both Negro and white
speakers (y3, y4, y5) were significantly predicted by the
socioeconomic status scores and by the articulatory product
scores, each at the .01 level of confidence.

The negative correlations would make it appear that, as SES
score and AP score go down, quality rating goes up. The
reverse of this is the case, however, since the scale of
quality rating in the study used number 1 for best speakers and
number 3 for poorest speakers. High ratings on the scale mean that
speakers were judged to be average or poor speakers. The -.73
correlation between the AP score and quality ratings received
by all speakers indicates, for instance, that as these scores
go down, speakers become rated as poorer speakers. The same
relationship exists in the case of the SES scores as indicated by
the -.63 correlation with ratings received.

Number of phonetic distortions, SES score, and AP score 
were the only independent variables found to be significant in the 
regression analysis. The remaining variables included in the questions 
in Chapter II and listed previously in this chapter were not found 
to function significantly in the racial identification and rating of 
Negro and white speakers by Negro and white listeners. F-ratios 
for all variables are reported in Appendix D. 
II. RESULTS OF SPECTROGRAPHIC ANALYSIS

A major question posed for this study was whether significant
mean differences could be found to exist on selected acoustic
variables between a group of speakers consistently judged by
listeners to be Negro and a group of speakers consistently judged
by listeners to be white. Table 21 indicates means and ranges of
the formant frequencies of the (i) vowel. Although the mean F2
frequency and the mean F3 frequency for Negroes are above those of
whites in Table 21, this relationship does not exist for the F1
frequency. No consistent or significant intergroup differences could
be found on this variable, using a t-test (see Appendix E).

TABLE 21

MEANS AND RANGES OF THE FORMANT FREQUENCIES OF
THE (i) VOWEL

                        Mean
Race of Speakers        Formant Frequency       Range

Negro                   F1    260 Hz             187 Hz -  343 Hz
                        F2   2222 Hz            1968 Hz - 2437 Hz
                        F3   2831 Hz            2500 Hz - 3250 Hz

White                   F1    319 Hz             218 Hz -  437 Hz
                        F2   2076 Hz            1795 Hz - 2500 Hz
                        F3   2738 Hz            2250 Hz - 3312 Hz

All Speakers            F1    289 Hz             187 Hz -  437 Hz
                        F2   2149 Hz            1795 Hz - 2500 Hz
                        F3   2784 Hz            2250 Hz - 3312 Hz



Table 22 reports the means and ranges of relative formant
amplitudes for the (i) vowel. In comparing the means on the relative
amplitudes of Negro subjects with those of white subjects, greater
formant amplitude differences can be noted for the Negroes. The
differences between the means of the two groups were not, however,
found to be significant using a t-test (see Appendix E).
TABLE 22

MEANS AND RANGES OF RELATIVE FORMANT AMPLITUDES
FOR THE (i) VOWEL

                        Mean
                        Relative Amplitude
Race of Speakers        in Decibels             Range

Negro                   F1/F2    3.80           1 - 7
                        F2/F3    2.20           0 - 4
                        F1/F3    3.00           0 - 6

White                   F1/F2    3.00           1 - 5
                        F2/F3    2.00           0 - 4
                        F1/F3    2.00           0 - 6

All Speakers            F1/F2    3.40           1 - 7
                        F2/F3    2.10           0 - 4
                        F1/F3    2.50           0 - 6
Table 23 reports the means and ranges of the formant
frequencies of the (u) vowel. No consistent relationship could be
found in the data for the two groups and the differences between
the means were not found to be significant using a t-test (see
Appendix E).

TABLE 23

MEANS AND RANGES OF THE FORMANT FREQUENCIES
OF THE (u) VOWEL

                        Mean
Race of Speakers        Formant Frequency       Range

Negro                   F1    284 Hz             156 Hz -  417 Hz
                        F2   1033 Hz             709 Hz - 1187 Hz
                        F3   2326 Hz            1935 Hz - 2562 Hz

White                   F1    400 Hz             218 Hz -  468 Hz
                        F2   1353 Hz            1125 Hz - 1937 Hz
                        F3   2212 Hz            1968 Hz - 2562 Hz

All Speakers            F1    342 Hz             156 Hz -  468 Hz
                        F2   1193 Hz             709 Hz - 1937 Hz
                        F3   2269 Hz            1935 Hz - 2562 Hz



Table 24 reports the means and ranges of relative formant
amplitudes for the (u) vowel. Although the same consistently lower
relative formant amplitudes were found for Negro speakers on the
(u) vowel as were reported for the (i) vowel, the differences
between the means for the two groups were not found to be
significant using a t-test (see Appendix E).

TABLE 24

MEANS AND RANGES OF RELATIVE FORMANT AMPLITUDES
FOR THE (u) VOWEL

                        Mean
                        Relative Amplitude
Race of Speakers        in Decibels             Range

Negro                   F1/F2    5.40           3 - 7
                        F2/F3    4.50           1 - 7
                        F1/F3    9.90           6 - 12

White                   F1/F2    4.30           1 - 8
                        F2/F3    2.90           2 - 6
                        F1/F3    5.60           0 - 11

All Speakers            F1/F2    4.85           1 - 8
                        F2/F3    3.70           1 - 7
                        F1/F3    7.75           0 - 12
III. SUMMARY OF RESULTS

1. Number of phonetic distortions was found to be significant in
predicting when a recorded speech sample would be identified
as having been performed by a Negro speaker;

2. Number of phonetic distortions was found to be significant in
predicting when a recorded speech sample would be identified as
having been performed by a white speaker;

3. The socioeconomic status score of the speaker and the articulatory
product score of the speaker were found to be significant in
predicting the speech quality rating received by the speaker from
listeners;

4. The following independent variables included in the questions
posed for this research in Chapter II were not found, through the
regression analysis, to function significantly in listener perception
of racial identity and quality rating of speakers (see F-ratios listed
in Appendix D):

    1. Age;

    2. Sex;

    3. Articulation errors;

        a. number of phonetic omissions,

        b. number of phonetic substitutions,

    5. Total number of misarticulated phonemes;

    6. Self-rating of speech proficiency.

5. No significant differences were found between the Negro and
white means on formant frequencies of the (i) and (u) vowels;

6. The relative formant amplitudes of Negro speakers on (i) and (u)
were consistently lower than those of white speakers but this
difference was not found to be significant.
CHAPTER V

DISCUSSION AND SUMMARY

The discussion will be divided into three sections: 1. the
findings of this study relative to informational content and social
dialect analysis; 2. the results of the spectrographic analysis; and
3. suggestions for further research. These sections will be followed
by a summary of the study.

I. INFORMATIONAL CONTENT AND SOCIAL DIALECT ANALYSIS 

It is believed that the most important results reported in the
previous chapter are those contributed by the regression analysis.
Apparently, the number of phonetic distortions is significant in
predicting whether recorded speech samples will be identified as having
been performed by a Negro or by a white speaker. This finding
contributes basic information in the psychoacoustics of speech which
may prove to be useful to school personnel and others interested in
intergroup communication and socio-linguistic cues carried in speech
signals. The strength of the prediction applies to all speakers, Negro
and white. Apparently, number of phonetic distortions functioned
significantly in cases in which white speakers were identified as being
Negro speakers and in cases in which Negro speakers were thought to
be white speakers.

Subsequent checking of the speaker data indicated that in all 
cases phonetic distortions applied to vowel sounds. This would mean 
that significant cues for racial perception in the recorded samples 
were related to vowel production. 

The finding that the articulatory product score is significant
in predicting how white and Negro speakers will be rated by white and
Negro listeners offers further validation of this instrument. The
articulatory product score reflects speaker performance on the factors of
duration and whole word articulatory accuracy. The zero order correlations
between these two factors and speaker rating were .21 for duration and .75
for total misarticulated words. The higher correlation for total
misarticulated words indicates that this factor contributed the
greater strength to the prediction relationship. This may provide
information helpful in structuring dialect remediation programs.

The socioeconomic status score combined occupation, income
and education. Since the analysis did not distinguish these factors, it
is not possible to estimate the relative strength each may have contributed
to the prediction relationship.
II. SPECTROGRAPHIC ANALYSIS

Ladefoged (29) says that the idiosyncratic features of speech 
signals which provide personal information about speakers may be 
attributed to anatomical and physical characteristics of the individual 
speaker such as the shape, size and coupling of resonance cavities of 
the vocal tract. Group features providing socio-linguistic information 
are, according to Ladefoged, most attributable to the influence of the 
particular groups in which the speaker is or has been a member. It can 
be inferred from this distinction that Ladefoged does not believe that 
the communication of socio-linguistic information is a function of vocal 
tract resonance characteristics which would make distinguishable 
differences in spectrographic displays. The results of the intergroup 
spectrographic analysis reported in Chapter IV would tend to confirm 
this opinion in reference to the particular acoustic features examined in 
this study. It should be pointed out, on the other hand, that both the 
number of features studied and sample size were limited in this phase 
of the research. 



III. SUGGESTIONS FOR FURTHER RESEARCH

One of the recognized limitations of the study reported here
is that the speech material used for analysis was acquired through
use of a reading passage. Since this was a speech study that did
not include non-phonological aspects of spoken language as variables
to be investigated, it was believed that the advantages of controlled
sampling outweighed the recognized disadvantages of using a reading
passage for gathering speech material. In view of the possible
qualification of these results due to the restriction imposed by the
reading passage, further research should be carried out to determine
whether the reported relationship of variables would remain constant
under conditions of extemporaneous speech material.

The problems in social dialect and informational content
analysis formulated for this particular study were limited to analysis
of variables which might be found to predict the racial identification
and rating of speakers. The social dialect analysis should be extended
through continued research to further specify the relationship of variables
found to be significant in this study.

The spectrographic analysis reported here should be continued.
Other phonemes and acoustic parameters should be investigated as part of
research directed at a clearer understanding of perceptual cues in speech
signals. Such features as formant bandwidth, interformant fill, duration,
and vowel transition characteristics may prove to be useful areas of
inquiry in future attempts to specify acoustic correlates of racial
perception. The finding reported in Chapter IV of consistently
reduced formant amplitudes among the Negro speakers studied may also
indicate the need for further investigation.
IV. SUMMARY

This study used ninety-one subjects in an attempt to
specify social and acoustic variables which function significantly
in the racial identification and rating of Negro and white speakers
by Negro and white listeners. Eighty-six subjects, forty-three
white and forty-three Negro, provided the listener responses.
Subjects were chosen to provide a sample approximately representative
of the distribution of socioeconomic status scores in Southeastern
United States.

Listeners were asked to judge the race and overall speech
proficiency of speakers from listening to a recorded reading passage.
Comparative control was exercised over the quality ratings through
the use of a semi-objective articulatory product score which
provided an independent index of speech proficiency. Additional
independent variables included the socioeconomic status score; sex; age;
number of articulation errors divided into substitutions, omissions and
distortions; number of misarticulated phonemes; and a self-rating of
speech proficiency. All speaker and listener data were gathered under
controlled laboratory conditions. Analysis was carried out through
analysis of variance and co-variance using multiple regression technique
to determine variables which might be significant in predicting racial
identity perception and quality rating of speakers.
A spectrographic analysis was carried out using a subsample
consisting of ten Negro male and ten white male subjects.
All speakers used in this analysis had been correctly identified by
listeners as to race 95% of the time or better.

The purpose of this phase of the study was to specify
spectral data in the resonance characteristics of speakers as seen
in two selected vowel sounds which might function significantly in
listener perception of racial identity and the quality rating of speakers.
An intergroup comparison was carried out on the acoustic variables
of formant frequency and relative formant amplitude from spectrographic
displays of the (i) and (u) vowels.

The results can be summarized as follows:

1. Number of phonetic distortions is significant in predicting
listener identification of the race of speakers from recorded
speech samples.

2. Socioeconomic status score and articulatory product score are
significant factors in predicting speech quality ratings received
by Negro and white speakers from Negro and white listeners.

3. No significant intergroup differences were found in the comparison
carried out on acoustic variables from spectrographic displays.
The Negro speakers were found, however, to have consistently
lower relative formant amplitudes than the white group.
[Reading passage with phonetic transcription; only the following
portions of the orthographic text are recoverable.]

... SWEET ... CAKE. OTHERS PURPOSELY RESTRICT THEMSELVES TO A
GLASS OF ORANGE JUICE. SOME FREQUENTLY GO WITHOUT A MORNING
MEAL. ... WHO EAT LIGHTLY ... LUNCH ... DO THOSE ...

I WILL SAY BEET.

I WILL SAY BOOT.
QUESTIONNAIRE

NUMBER ________                                  DATE ________

(NOTE: All information on this form is kept strictly confidential)

NAME ____________________________          DO NOT WRITE IN
                                           THIS SPACE
ADDRESS _________________________          h.ch.  hearing
                                           sp.ch. speech
AGE ____  SEX: MALE ( ), FEMALE ( )        e.b.   ethnic
TELEPHONE NUMBER ________________          l.b.   linguistic
EMPLOYED BY _____________________          g.h.   health
                                           sp.tr. speech train.
TELEPHONE AT WORK _______________          r.a.   reading

PLEASE FILL IN ALL OF THE FOLLOWING INFORMATION:

(1) OCCUPATION of the chief income recipient of the family

(2) EDUCATION of the chief income recipient of the family (please
    circle the highest year completed in school)

    Elementary     1 2 3 4 5 6 7 8
    High School    1 2 3 4
    College        1 2 3 4 5 or more

(3) INCOME Total income for the whole family per year (please
    check correct answer)

    Loss, none or less than $500
    $500 to $999                $6000 to $6499
    $1000 to $1499              $6500 to $6999
    $1500 to $1999              $7000 to $7499
    $2000 to $2499              $7500 to $7999
    $2500 to $2999              $8000 to $8499
    $3000 to $3499              $8500 to $8999
    $3500 to $3999              $9000 to $9499
    $4000 to $4499              $9500 to $9999
    $4500 to $4999              $10,000 to $14,999
    $5000 to $5499              $15,000 to $24,999
    $5500 to $5999              $25,000 or more

(4) PLEASE CHECK ONE:

    I think I am a good speaker
    I think I am an average speaker
    I think I am a poor speaker


APPENDIX C. 



LISTENER RESPONSE FORM 
LISTENER RESPONSE FORM

NUMBER ________                                  DATE ________

NAME ____________________________

INSTRUCTIONS: Please respond to EVERY question about EVERY
speaker. Mark ONE answer in every block for each speaker even
if you are undecided and must guess. You will have 15 seconds
between each speaker to mark your form. Answer each question by
putting a CIRCLE around your choice. Be sure to answer all 4
questions about each speaker.

SPEAKER 1

(1) VOCAL QUALITY (circle one)         (2) SEX (circle one)

    GOOD      AVERAGE   POOR               MALE    FEMALE
    SPEAKER   SPEAKER   SPEAKER

(3) AGE (circle one)                   (4) ETHNIC BACKGROUND (circle one)

    15-25   25-40   40-60                  NEGRO   WHITE

SPEAKER 2

(etc.)

APPENDIX D 



Key to Variables and Combinations of 
Variables Used in Full and Restricted Models 



Full Models Used in Analysis 



Tables of the Effects on Dependent 
Variables Attributable to the 
Independent Variables 















KEY TO VARIABLES AND COMBINATIONS OF VARIABLES USED IN
FULL AND RESTRICTED MODELS

Predictor Variables

Number    Variable or Combination

 2        Negro speaker
 3        White speaker
 4        Age of speaker
 5        Male speaker
 6        Female speaker
 7        Socioeconomic status score (SES)
 8        Articulatory product score (AP)
 9        Quality rating received by speaker
13        Self-rating made by speaker
16        Accuracy in racial identification
37        Number of phonetic substitutions
39        Number of phonetic distortions
40        Number of misarticulated phonemes
41        Unit vector
42        2 x 7
43        3 x 7
44        2 x 8
45        3 x 8
46        2 x 9
47        3 x 9
48        2 x 16
49        3 x 16
50        2 x 37
51        3 x 37
52        2 x 38
53        3 x 38
54        2 x 39
55        3 x 39
56        2 x 40
57        3 x 40
58        if not zero, element in vector 44 minus mean of 47 persons in group, else zero
59        if not zero, element in vector 45 minus mean of 44 persons in group, else zero
60        if not zero, element in vector 46 minus mean of 47 persons in group, else zero
61        if not zero, element in vector 47 minus mean of 44 persons in group, else zero
62        if not zero, element in vector 48 minus mean of 47 persons in group, else zero
63        if not zero, element in vector 49 minus mean of 44 persons in group, else zero
64        if not zero, element in vector 50 minus mean of 47 persons in group, else zero
65        if not zero, element in vector 51 minus mean of 44 persons in group, else zero
66        if not zero, element in vector 52 minus mean of 47 persons in group, else zero
67        if not zero, element in vector 53 minus mean of 44 persons in group, else zero
68        if not zero, element in vector 54 minus mean of 47 persons in group, else zero
69        if not zero, element in vector 55 minus mean of 44 persons in group, else zero
70        if not zero, element in vector 56 minus mean of 47 persons in group, else zero
71        if not zero, element in vector 57 minus mean of 44 persons in group, else zero
72        58 + 59
73        60 + 61
74        62 + 63
75        64 + 65
76        66 + 67
77        68 + 69
78        70 + 71

Criterion Variables

Number    Variable

23        Times speaker perceived as Negro speaker
29        Times speaker perceived as white speaker
 9        Quality rating received by speakers
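The construction of the combination vectors in the key above can be sketched as follows, under stated assumptions. A product vector such as 2 x 8 is taken here to be the element-by-element product of a group membership vector and a score vector, and the "if not zero ... else zero" rule is taken to subtract the group mean from each member's value while leaving non-members at zero. The function names and the four-speaker data are hypothetical, and the sketch assumes every group member has a nonzero score, so that the mean of the nonzero elements equals the group mean.

```python
def product_vector(group_dummy, scores):
    """Element-by-element product of a 0/1 group vector and a score vector."""
    return [d * s for d, s in zip(group_dummy, scores)]


def center_within_group(vector):
    """Subtract the mean of the nonzero elements from each nonzero element.

    Zero elements (speakers outside the group) are left at zero.
    """
    members = [v for v in vector if v != 0]
    if not members:
        return list(vector)
    group_mean = sum(members) / len(members)
    return [v - group_mean if v != 0 else 0 for v in vector]


# Hypothetical data: four speakers, the first two in the group of interest.
negro_speaker = [1, 1, 0, 0]
ap_score = [2, 4, 3, 5]

v_product = product_vector(negro_speaker, ap_score)   # analogous to 2 x 8
v_centered = center_within_group(v_product)           # analogous to vector 58
print(v_product, v_centered)
```

Adding the centered vector for one group to the centered vector for the other, as in 58 + 59, gives a single score vector in which each speaker's value is expressed as a deviation from the mean of his own group.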
FULL MODELS USED IN ANALYSIS

1. Criterion variable: 23

(A) 2-4, 15-15, 41-41, 72-78
(B) 2-8, 15-15, 41-41, 73-78
(C) 2-7, 9-9, 15-15, 41-41, 72-72, 74-78
(D) 2-7, 15-16, 41-41, 72-73, 75-78
(E) 2-7, 15-15, 37-37, 41-41, 72-74, 76-78
(F) 2-7, 15-15, 38-38, 41-41, 72-75, 77-78
(G) 2-7, 15-15, 39-39, 41-41, 72-76, 78-78
(H) 2-7, 15-15, 40-41, 72-77

2. Criterion variable: 29

(A) 2-7, 15-16, 41-41, 72-73, 75-78
(B) 2-3, 15-15, 41-41, 73-78
(C) 2-7, 9-9, 15-15, 41-41, 72-72, 74-78
(D) 2-7, 15-15, 37-37, 41-41, 72-74, 76-78
(E) 2-7, 15-15, 38-38, 41-41, 72-75, 77-78
(F) 2-7, 15-15, 39-39, 41-41, 72-76, 78-78
(G) 2-7, 15-15, 40-41, 72-77

3. Criterion variable: 9

(A) 2-8, 15-16, 37-38, 40-41, 77-77
(B) 2-7, 15-15, 39-39, 41-41, 72-76, 78-78
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removed 14 .8199 (.70) 
TABLE 28

t-RATIOS RESULTING FROM COMPARISON OF MEANS
OF SPECTROGRAPHIC VARIABLES

                  Difference Between
Variable          the Means               t-ratio

(i) Vowel
  F1                  59                   2.02
  F2                 146                   1.11
  F3                  93                    .60
  F1/F2              .80                    .40
  F2/F3              .20                    .29
  F1/F3             1.00                    .85

(u) Vowel
  F1                 116                   1.25
  F2                 320                   1.14
  F3                 114                    .83
  F1/F2             1.10                   1.10
  F2/F3             1.60                   2.08
  F1/F3             4.30                   1.50

NOTE: Critical value of t at 18 degrees of freedom = 2.10
for .05 level.
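The decision rule stated in the note to Table 28 can be sketched as follows. The t-ratios and the critical value of 2.10 are copied from the table; the pooled-variance formula in `pooled_t` is the usual independent-groups t-ratio and is an assumption, since the computing formula is not reproduced in this section, and the means and standard deviations passed to it below are hypothetical.

```python
import math


def pooled_t(mean1, mean2, sd1, sd2, n1, n2):
    """Independent-groups t-ratio with pooled variance; df = n1 + n2 - 2."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_error


# t-ratios as reported in Table 28, keyed by (vowel, variable).
T_RATIOS = {
    ("i", "F1"): 2.02, ("i", "F2"): 1.11, ("i", "F3"): 0.60,
    ("i", "F1/F2"): 0.40, ("i", "F2/F3"): 0.29, ("i", "F1/F3"): 0.85,
    ("u", "F1"): 1.25, ("u", "F2"): 1.14, ("u", "F3"): 0.83,
    ("u", "F1/F2"): 1.10, ("u", "F2/F3"): 2.08, ("u", "F1/F3"): 1.50,
}
CRITICAL_T = 2.10  # .05 level, 18 degrees of freedom (two groups of ten)

significant = [key for key, t in T_RATIOS.items() if abs(t) >= CRITICAL_T]
print(significant)

# Hypothetical use of the formula itself (values not from the study).
example_t = pooled_t(10.0, 8.0, 2.0, 2.0, 10, 10)
```

With ten speakers in each group the degrees of freedom are 10 + 10 - 2 = 18, which matches the note to the table. Two reported ratios, 2.02 for F1 of the (i) vowel and 2.08 for F2/F3 of the (u) vowel, approach but do not reach the critical value.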




APPENDIX F

RATIONALE FOR ANALYSIS
TAKEN FROM BOTTENBERG AND WARD (3)

Careful consideration of this outline in conjunction with
Figures 1, 2, and 3 will disclose the logic governing the sequence in which estimates for
restricted models should be obtained for any problem of this type. The numbers in parentheses
refer to sections of the text which fully describe the analyses.

Sequence of Tests of Hypotheses

                                 Mathematical
Question                         Expression         Analysis     Answer   Figure

1. Is amount of change in        k3 = k4            (5.2.4.1)    Yes      1
   criterion per unit of                                         No       2 or 3
   concomitant variable
   the same for both
   treatments over observed
   range of concomitant
   variable?

Given k3 = k4

2. Are the two treatments        k1 = k2, i.e.,     (5.2.4.2)    Yes
   equally effective over        d2 = 0                          No
   observed range of the
   concomitant variable?

Given k3 ≠ k4

3. At what point (a0) on         If a0 is           (5.2.4.3)    Yes      3
   concomitant variable          estimate                        No       2
   may both treatments be        of m0 (in
   expected to be equally        Fig. 3),
   effective?

   Is a0 within range of
   interest?

The flowchart in Figure 4 outlines the sequence of steps necessary for comparing the 
effects of two treatments when a concomitant variable may be operative. The principles that 
determine this sequence are applicable to problems involving several treatments and several 
concomitant variables. In such problems, however, there are more relationships possible 
between the criterion and concomitant variables; and these relationships may differ from treatment 
to treatment. If the relationships do differ, any conclusion about the superiority of a treatment 
is contingent upon the range of values of the concomitant variables that are considered 
simultaneously. However, when the relationships can be shown to be constant from treatment to 
treatment, the determination of which one of several treatments is superior can be made by 
following a sequence of steps analogous to that shown in Figure 4 for two-treatment problems. 
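
The two-treatment sequence described above can be sketched in code. This is a minimal Python illustration under stated assumptions, not Bottenberg and Ward's procedure as published: each treatment is assumed to be described by a straight line (intercept plus slope times the concomitant variable), the outcomes of the two hypothesis tests are supplied by the caller as booleans, and the names crossover_point and sequence_outcome are hypothetical. The mapping of these parameters onto the k symbols of the excerpt is not asserted here.

```python
def crossover_point(intercept_1, slope_1, intercept_2, slope_2):
    """Concomitant-variable value at which two straight lines meet.

    Solves intercept_1 + slope_1 * x == intercept_2 + slope_2 * x for x.
    Assumes (for illustration) one fitted line per treatment.
    """
    if slope_1 == slope_2:
        raise ValueError("parallel lines have no single crossover point")
    return (intercept_2 - intercept_1) / (slope_1 - slope_2)


def sequence_outcome(intercept_1, slope_1, intercept_2, slope_2,
                     slopes_equal, intercepts_equal, low, high):
    """Walk the three questions of the sequence of tests.

    slopes_equal and intercepts_equal are the caller's test decisions
    (assumed inputs); low and high bound the range of interest.
    """
    # Question 1: same change in criterion per unit of concomitant variable?
    if slopes_equal:
        # Question 2: equally effective over the observed range?
        if intercepts_equal:
            return "Figure 1: treatments equally effective"
        return "Figure 1: one treatment superior over the observed range"
    # Question 3: where are the treatments expected to be equally effective?
    a0 = crossover_point(intercept_1, slope_1, intercept_2, slope_2)
    if low <= a0 <= high:
        return "Figure 3: superiority depends on the concomitant variable"
    return "Figure 2: one treatment superior within the range of interest"


if __name__ == "__main__":
    print(crossover_point(1.0, 2.0, 5.0, 1.0))
    print(sequence_outcome(1.0, 2.0, 5.0, 1.0, False, False, 0.0, 10.0))
```

Passing the test decisions in as booleans keeps the sketch independent of how the restricted and full regression models are actually compared, which the excerpt defers to the cited sections.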

ABSTRACT 

The purpose of this study was to specify variables which 
function significantly in the racial identification and speech 
quality rating of Negro and white speakers by Negro and white 
listeners. An additional purpose was to specify any significant 
differences in vocal resonance characteristics between a group of 
male speakers most often identified by listeners as being Negro and 
a group of male speakers most often identified as being white. 
Distribution of socioeconomic status scores within the sample was 
representative of the distribution of such scores in Southern U.S. 

Listeners were asked to identify the race of each speaker and 
make a speech quality rating of recorded samples. The Articulatory 
Product (AP) score developed by Guttman was used as an independent, 
semi-objective index of speech proficiency. Resonance characteristics 
were studied through analysis of spectrographic displays of the 
[i] and [u] vowels. 

Results: (1) Number of phonetic distortions by speakers 
predicts racial identification by listeners; (2) Socioeconomic status 
score and Articulatory Product score predict speech quality rating of 
speakers by listeners; (3) No significant intergroup differences were 
found on spectrographic variables. Negro speakers used in acoustic 
analysis, however, had consistently greater attenuation of formant 
amplitudes of the [u] vowel than white speakers. 



